Extract rows from an XML file

Supported in: Batch

Reads a dataset of files and parses each XML file into rows.

Transform categories: File

Declared arguments

  • Dataset - Dataset of files to process.
    Files
  • Schema - Schema definition used when parsing the XML files.
    Type<Struct>
  • XML tag - XML tag used as the row boundary; one row is generated per occurrence of this tag.
    Literal<String>
  • optional Attribute prefix - Prefix applied to the names of attributes found on tags.
    Literal<String>
  • optional Encoding - The encoding type (character set) of the input file.
    Enum<ISO_8859_1, UTF_8>
  • optional Value tag - The tag used for an element's value when the element has attributes but no child elements.
    Literal<String>
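
To make the roles of the XML tag, Attribute prefix, Value tag, and Encoding arguments more concrete, the following is a minimal Python sketch that mimics the described behavior using only the standard library. It is an illustration under stated assumptions, not the transform's implementation: the function names `extract_rows` and `to_value`, the sample document, and the default values `"_"` and `"_VALUE"` are all hypothetical and are not taken from this reference. Schema handling is omitted, so every value stays a string.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input, stored as bytes the way a file would be.
SAMPLE = b"""<catalog>
  <book id="b1">
    <title>Dune</title>
    <price currency="USD">9.99</price>
  </book>
  <book id="b2">
    <title>Emma</title>
    <price currency="GBP">4.50</price>
  </book>
</catalog>"""


def to_value(elem, attribute_prefix, value_tag):
    """Convert one element into a plain string or a nested dict."""
    # Attributes become fields whose names carry the attribute prefix.
    record = {attribute_prefix + name: val for name, val in elem.attrib.items()}
    text = (elem.text or "").strip()
    children = list(elem)
    if not children:
        if not record:
            # Leaf element without attributes: just its text.
            return text
        # Attributes but no child elements: the text goes under the value tag.
        record[value_tag] = text
        return record
    for child in children:
        # Simplification: a repeated child tag overwrites the earlier one.
        record[child.tag] = to_value(child, attribute_prefix, value_tag)
    return record


def extract_rows(xml_bytes, row_tag, attribute_prefix="_",
                 value_tag="_VALUE", encoding="utf-8"):
    """Return one record per occurrence of row_tag (defaults are assumptions)."""
    root = ET.fromstring(xml_bytes.decode(encoding))
    return [to_value(elem, attribute_prefix, value_tag)
            for elem in root.iter(row_tag)]


rows = extract_rows(SAMPLE, row_tag="book")
for row in rows:
    print(row)
```

With the sample above, each `book` element yields one row. In the first row, the `id` attribute appears as `_id`, and the `price` element, which has an attribute but no child elements, becomes a nested record holding `_currency` and `_VALUE`.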