Snap type:

Parse

Description:

This Snap parses the incoming XML data into SnapLogic document objects.

Prerequisites:

[None]

Support:

Works in Ultra Task Pipelines.

Limitations

Mixed content, as in the example below, is not supported. In mixed content, an element contains text interleaved with child elements, and may also carry attributes.

<letter>
       Dear Mr. <name>John Smith</name>.
       Your order <orderid>1032</orderid>
       will be shipped on <shipdate>2001-07-13</shipdate>.
</letter>
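One way to check input for mixed content before it reaches the Snap is sketched below in Python. This is only an illustration using the standard library's xml.etree.ElementTree; the helper name has_mixed_content is hypothetical and is not part of SnapLogic.

```python
import xml.etree.ElementTree as ET

def has_mixed_content(element):
    """Return True if any element holds both non-whitespace text and child elements."""
    for node in element.iter():
        children = list(node)
        if not children:
            continue
        # Text before the first child is stored in node.text;
        # text following each child is stored in that child's .tail.
        pieces = [node.text] + [child.tail for child in children]
        if any(piece and piece.strip() for piece in pieces):
            return True
    return False

letter = ET.fromstring(
    "<letter>Dear Mr. <name>John Smith</name>. "
    "Your order <orderid>1032</orderid> will be shipped on "
    "<shipdate>2001-07-13</shipdate>.</letter>"
)
order = ET.fromstring(
    "<order><orderid>1032</orderid><shipdate>2001-07-13</shipdate></order>"
)

print(has_mixed_content(letter))  # True
print(has_mixed_content(order))   # False
```

Whitespace used only for indentation between elements is ignored by this check, so pretty-printed XML is not flagged.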

Account:

Accounts are not used with this Snap.

Views:

Input

This Snap has exactly one binary input view.

Output

This Snap has exactly one document output view.

Error

This Snap has at most one document error view and produces zero or more documents in the view. If the Snap fails during the operation, an error document is sent to the error view containing the fields error, reason, original, resolution, and stacktrace:

{
error: "Failed to convert xml to json",
reason: "Unexpected character ...",
original: {...},
resolution: "Please check if the xml data is well formed",
stacktrace: "java.lang.RuntimeException: ..."
}
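The routing described above can be pictured with a small Python sketch. It is an analogy only, not the Snap's implementation: parse_or_error is a hypothetical helper, the "content" key inside original is an assumption, and the reason text comes from Python's parser rather than from SnapLogic.

```python
import traceback
import xml.etree.ElementTree as ET

def parse_or_error(xml_text):
    """Return (document, None) on success or (None, error_document) on failure."""
    try:
        root = ET.fromstring(xml_text)
        return {root.tag: root.text}, None
    except ET.ParseError as exc:
        # Mirror the five fields listed for the error view.
        return None, {
            "error": "Failed to convert xml to json",
            "reason": str(exc),
            "original": {"content": xml_text},  # assumed shape, for illustration
            "resolution": "Please check if the xml data is well formed",
            "stacktrace": traceback.format_exc(),
        }

# The <b> element is never closed, so the input is not well formed.
doc, err = parse_or_error("<a><b></a>")
print(doc)          # None
print(sorted(err))  # ['error', 'original', 'reason', 'resolution', 'stacktrace']
```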

Settings

Label

Required. The name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your pipeline.

Inbound Schema

URL of the XSD schema definition file for the incoming data. The currently supported URL protocols are SLDB, HDFS, and S3.

If you enter an Inbound schema, you must also select the Validate XML and Match data types properties to derive the output as per the defined schema.

Example: sldb:///foo/bar/customer.xsd

Default value: [None]

Validate XML

Required. If selected, the incoming data is validated against the provided XSD schema definition.

If you enter an Inbound schema, you must also select the Validate XML and Match data types properties to derive the output as per the defined schema.

Example: False

Default value: False

Splitter

Splits the incoming XML document into multiple smaller documents using the XPath expression.

This expression must be of the form a/b/c/d or ns1:a/ns2:b/ns3:c/ns4:d, where the prefixes ns1 through ns4 can be the same or different.

Example 1: Splitter Expression without prefix

To access elements in a default namespace, use a unique prefix in the Splitter expression, followed by a colon and the tag name, and provide the corresponding namespace URI in the Prefix/URI table.
Before choosing the prefix, make sure it is not already used in the XML.

Splitter	breakfast_menu/food

If the XML data is of the form:

[Image: XML Parser1(1).png]

The output will be:

[Image: XMLParser1_output.png]
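The default-namespace technique described in Example 1 can be tried outside SnapLogic with a short Python sketch. The sample XML, the namespace URI http://www.example.com/menu, and the prefix m are all assumptions made for this illustration; the standard library's ElementTree stands in for the Snap's XPath handling.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: every element sits in a default namespace (no prefix).
MENU_XML = """
<breakfast_menu xmlns="http://www.example.com/menu">
  <food><name>Belgian Waffles</name></food>
  <food><name>French Toast</name></food>
</breakfast_menu>
"""

# Bind an unused prefix ("m", chosen for this sketch) to the default namespace URI.
namespaces = {"m": "http://www.example.com/menu"}

root = ET.fromstring(MENU_XML)

# ElementTree paths are relative to the root element, so a Splitter-style
# path m:breakfast_menu/m:food becomes a root-tag check plus "m:food".
assert root.tag == "{http://www.example.com/menu}breakfast_menu"
foods = root.findall("m:food", namespaces)
names = [food.findtext("m:name", namespaces=namespaces) for food in foods]
print(names)  # ['Belgian Waffles', 'French Toast']

# Without the prefix, the unqualified path matches nothing.
print(root.findall("food"))  # []
```

The last line shows why the prefix is needed: elements in a default namespace are not matched by their bare tag names.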

Example 2: Splitter Expression with prefix

For the Splitter expression d:catalog/d:book, the output contains two documents, one for each book tag in the XML file.
If the Splitter expression contains prefixes, they must be defined in the Namespace Context so that the Snap can resolve them.

Splitter	d:catalog/d:book

Prefix	URI
d	http://www.develop.com/student

[Image: XMLParser2(1).png]

In the Settings, enter d:catalog/d:book in the Splitter field and http://www.develop.com/student in the URI field to get the output view containing the data with the prefix 'd'.

[Image: XMLParser.jpg]

The output view is:

[Image: XMLParser2output.png]

Default value: [None]

Namespace Context

Optional.

Namespace context for the expression provided in the Splitter property.

Namespaces are typically declared in the XML in the form xmlns:prefix="URI".

Prefix

Prefixes included in the expression provided in the Splitter property.

URI

URIs associated with the prefixes.

Match data types

If selected, values in the input document are output with the data types declared in the Inbound schema property.

Supported XSD design pattern: Russian Doll
Supported data types: xs:string, xs:int, xs:integer, xs:long, xs:short, xs:byte, xs:float, xs:double, xs:decimal, and xs:boolean.
XSD files that follow the Salami Slice, Venetian Blind, or Garden of Eden design patterns are not supported.
If you enter an Inbound schema, you must also select the Validate XML and Match data types properties to derive the output as per the defined schema.
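To make the idea of matching data types concrete, the sketch below maps each supported XSD type to a Python converter. It is an assumption-laden illustration, not the Snap's actual conversion logic: the CONVERTERS table and match_data_type helper are hypothetical, and range limits (for example, the 8-bit limit of xs:byte) are not enforced.

```python
from decimal import Decimal

def parse_xs_boolean(text):
    # XML Schema allows the lexical forms true, false, 1, and 0 for xs:boolean.
    value = text.strip()
    if value in ("true", "1"):
        return True
    if value in ("false", "0"):
        return False
    raise ValueError(f"not an xs:boolean: {text!r}")

# Hypothetical mapping from the supported XSD types to Python converters.
CONVERTERS = {
    "xs:string": str,
    "xs:int": int,
    "xs:integer": int,
    "xs:long": int,
    "xs:short": int,
    "xs:byte": int,
    "xs:float": float,
    "xs:double": float,
    "xs:decimal": Decimal,
    "xs:boolean": parse_xs_boolean,
}

def match_data_type(text, xs_type):
    """Convert element or attribute text to the value type declared in the schema."""
    return CONVERTERS[xs_type](text)

print(match_data_type("1", "xs:int"))            # 1
print(match_data_type("89.90", "xs:decimal"))    # 89.90
print(match_data_type("true", "xs:boolean"))     # True
```

Without such a conversion, every parsed value would remain a string, which is why the schema is needed to recover numeric and boolean types.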

Optimization

Select the parameter that you want to optimize during Snap execution. Available options:

None: Continues with standard memory consumption and speed
Memory: Leads to lower memory consumption and slower execution
Speed: Leads to higher memory consumption and faster execution

Default value: None

Schema language supported: W3C XML Schema 1.0

Examples

Splitting XML

In this example, XML that contains multiple purchase orders is split into individual orders.

The incoming XML looks something like this:

<po:PurchaseOrders xmlns:po="http://www.example.com">
   <po:PurchaseOrder po:PurchaseOrderNumber="23578" po:OrderDate="2015-01-20">
     <po:Address po:Type="Shipping">
       <po:Name>Full Name</po:Name>
       <po:Street>123 Maple Street</po:Street>
       <po:City>City Name</po:City>
       <po:State>CA</po:State>
       <po:Zip>10101</po:Zip>
       <po:Country>USA</po:Country>
     </po:Address>
     <po:Address po:Type="Billing">
       <po:Name>Another Name</po:Name>
       <po:Street>456 Oak Avenue</po:Street>
       <po:City>Town Name</po:City>
       <po:State>NJ</po:State>
       <po:Zip>99999</po:Zip>
       <po:Country>USA</po:Country>
     </po:Address>
     <po:DeliveryNotes>Please leave packages on side porch.</po:DeliveryNotes>
     <po:Items>
        <po:Item po:PartNumber="123456">
          <po:ProductName>Product</po:ProductName>
          <po:Quantity>1</po:Quantity>
          <po:USPrice>89.90</po:USPrice>
          <po:Comment>Refurbished</po:Comment>
        </po:Item>
     </po:Items>
   </po:PurchaseOrder>
   <po:PurchaseOrder po:PurchaseOrderNumber="23579" po:OrderDate="2015-01-20">
   ...
   </po:PurchaseOrder>
</po:PurchaseOrders>

Each PurchaseOrder contains the shipping address, billing address and the items purchased.

To split these into individual orders, the Splitter field should contain the hierarchy down to where you want the split to occur, including any specified prefix. In this case, the value is: po:PurchaseOrders/po:PurchaseOrder.

You will also need to specify the Namespace Context, which is defined in the sample as xmlns:po="http://www.example.com".

[Image: XMLParser1.jpg]

This will result in separating the orders into individual documents.

[{...}, {...}, {...}]
    {po:PurchaseOrder: {...}}
    {po:PurchaseOrder: {...}}
    {po:PurchaseOrder: {...}}
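The same split can be approximated in Python to see what the Splitter and Namespace Context settings do together. This is a minimal sketch under stated assumptions: split_documents is a hypothetical helper, the sample is a trimmed version of the XML above (the second order is left empty), and ElementTree's path support stands in for the Snap's XPath handling.

```python
import xml.etree.ElementTree as ET

# Namespace Context: prefix -> URI, as declared in the sample XML.
PO_NS = {"po": "http://www.example.com"}

# Trimmed version of the purchase order sample.
ORDERS_XML = """
<po:PurchaseOrders xmlns:po="http://www.example.com">
  <po:PurchaseOrder po:PurchaseOrderNumber="23578" po:OrderDate="2015-01-20">
    <po:DeliveryNotes>Please leave packages on side porch.</po:DeliveryNotes>
  </po:PurchaseOrder>
  <po:PurchaseOrder po:PurchaseOrderNumber="23579" po:OrderDate="2015-01-20"/>
</po:PurchaseOrders>
"""

def split_documents(xml_text, splitter, namespaces):
    """Yield one element per match of a prefix-qualified path such as a/b/c."""
    root = ET.fromstring(xml_text)
    first, _, rest = splitter.partition("/")
    prefix, _, local = first.rpartition(":")
    # ElementTree stores qualified names as {uri}local.
    expected = f"{{{namespaces[prefix]}}}{local}" if prefix else local
    if root.tag != expected:
        return  # the first path step does not match the root element
    if not rest:
        yield root
        return
    yield from root.findall(rest, namespaces)

orders = list(
    split_documents(ORDERS_XML, "po:PurchaseOrders/po:PurchaseOrder", PO_NS)
)
number_attr = "{http://www.example.com}PurchaseOrderNumber"
print([order.get(number_attr) for order in orders])  # ['23578', '23579']
```

Each yielded element corresponds to one output document; a path whose first step does not match the root element yields nothing.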