Flattening Complex XML Elements
Most stages in a dataflow require data to be in a flat format. This means that when
you read hierarchical data from an XML file into a dataflow, you must flatten it
if the data contains complex XML elements. A complex XML element is an element
that contains other elements or attributes. For example, in the following data file
the <address> element and the <account> element are complex XML elements:
<customers>
  <customer>
    <name>Sam</name>
    <gender>M</gender>
    <age>43</age>
    <country>United States</country>
    <address>
      <addressline1>1253 Summer St.</addressline1>
      <city>Boston</city>
      <stateprovince>MA</stateprovince>
      <postalcode>02110</postalcode>
    </address>
    <account>
      <type>Savings</type>
      <number>019922</number>
    </account>
  </customer>
  <customer>
    <name>Jeff</name>
    <gender>M</gender>
    <age>32</age>
    <country>Canada</country>
    <address>
      <addressline1>26 Wellington St.</addressline1>
      <city>Toronto</city>
      <stateprovince>ON</stateprovince>
      <postalcode>M5E 1S2</postalcode>
    </address>
    <account>
      <type>Checking</type>
      <number>238832</number>
    </account>
  </customer>
  <customer>
    <name>Mary</name>
    <gender>F</gender>
    <age>61</age>
    <country>Australia</country>
    <address>
      <addressline1>Level 7, 1 Elizabeth Plaza</addressline1>
      <city>North Sydney</city>
      <stateprovince>NSW</stateprovince>
      <postalcode>2060</postalcode>
    </address>
    <account>
      <type>Savings</type>
      <number>839938</number>
    </account>
  </customer>
</customers>
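To make the definition of a complex XML element concrete, the following minimal Python sketch checks which children of a customer record contain other elements or attributes. It uses Python's standard xml.etree.ElementTree module and is not part of the dataflow product. The helper name is_complex and the shortened sample record are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the first customer record from the data file above.
SAMPLE = """<customers>
  <customer>
    <name>Sam</name>
    <gender>M</gender>
    <address>
      <city>Boston</city>
      <postalcode>02110</postalcode>
    </address>
    <account>
      <type>Savings</type>
      <number>019922</number>
    </account>
  </customer>
</customers>"""


def is_complex(element):
    """Hypothetical helper: True if the element has child elements or attributes."""
    return len(element) > 0 or bool(element.attrib)


root = ET.fromstring(SAMPLE)
customer = root.find("customer")

# Collect the tag names of the complex children of the customer record.
complex_names = [child.tag for child in customer if is_complex(child)]
print(complex_names)
```

For this sample the script prints ['address', 'account'], which matches the two elements identified above as complex.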
This procedure describes how to use Splitter stages to flatten XML data containing multiple complex XML elements.
- Add a Read from XML stage to your data flow and configure the stage. For more information, see Read From XML.
- Add a Broadcaster stage and connect Read from XML to it.
- Add a Splitter stage for each complex XML element in your data.
- Connect the Broadcaster stage to each Splitter.
- Add a Record Combiner stage and connect each Splitter to it.
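You should now have a data flow in which Read from XML feeds the Broadcaster, the Broadcaster feeds each Splitter, and each Splitter feeds the Record Combiner.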
- Double-click the first Splitter stage to open the stage options.
- In the Split at field, select one of the complex fields. In the example data file above, this could be the address field.
- Click OK.
- Configure each additional Splitter stage, selecting a different complex XML element in each Splitter's Split at field.
The data flow is now configured to take XML input containing records with complex XML elements and flatten the data. The resulting records from Record Combiner can be sent to any stage that requires flat data. For example, you could attach the Record Combiner stage to a Validate Address stage for address validation.
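As a conceptual illustration of what a flat record is, the following Python sketch turns one hierarchical customer record into a single-level set of fields. It is only a sketch under stated assumptions: it does not reproduce how the Splitter or Record Combiner stages work internally, and the flatten function and the parent_child field naming (for example, address_city) are hypothetical choices, not the field names the stages produce.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the first customer record from the example data file.
SAMPLE = """<customers>
  <customer>
    <name>Sam</name>
    <country>United States</country>
    <address>
      <city>Boston</city>
      <postalcode>02110</postalcode>
    </address>
    <account>
      <type>Savings</type>
      <number>019922</number>
    </account>
  </customer>
</customers>"""


def flatten(element, prefix=""):
    """Hypothetical helper: return a single-level dict for one record element.

    Child elements of a complex element become fields named parent_child.
    This naming is an assumption for illustration, not the product's output.
    """
    flat = {}
    for child in element:
        key = f"{prefix}{child.tag}"
        if len(child) > 0:
            # Complex element: descend and prefix its children with its name.
            flat.update(flatten(child, prefix=f"{key}_"))
        else:
            # Simple element: keep its text as the field value.
            flat[key] = (child.text or "").strip()
    return flat


records = [flatten(c) for c in ET.fromstring(SAMPLE).findall("customer")]
for record in records:
    print(record)
```

Each resulting record has no nested structure, which is the shape that stages requiring flat data expect.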