title | Breaking change for Pipelines using the XML Formatter Snap |
---|
Overview
You can use this Snap to format the incoming document objects into XML data format.
Note | ||
---|---|---|
| ||
After upgrading to 425patches10152, the Snap might fail with a Could not convert the document to an XML error when you select the Map input to first repeating element in XSD checkbox. In such cases, ensure that you provide a complete path from the root element for an object and map the input properly to the root element. |
Snap Type
This Snap is a Format-type Snap that formats incoming data objects into XML data format.
The conversion from the internal JSON structure to XML follows the StAXON mapping convention. This mapping uses naming conventions to turn JSON object fields into XML nodes and attributes. For example, using the default Snap settings, the following two documents will be collected and written out as a single chunk of XML:
Code Block |
---|
{ "msg" : { "@attr1" : "0", "$" : "Hello, World!" } }
{ "msg" : "Goodbye, World!" } |
The output of the formatter for these two documents looks something like the following. The "msg" fields become nodes in the XML structure, fields that start with "@" become attributes on those nodes, and "$" fields become text nodes. The "DocumentRoot", "Document", and related elements are part of the wrappers that the formatter adds when it collects multiple documents into a single file.
Code Block |
---|
<?xml version="1.0" encoding="UTF-8"?>
<DocumentRoot>
<Document>
<Metadata>
<asMap>
<global>
<doc_id>307db138-0ec4-11e4-a996-49c25019fb1e</doc_id>
</global>
</asMap>
<id>307db138-0ec4-11e4-a996-49c25019fb1e</id>
</Metadata>
<Data>
<msg attr1="0">Hello, World!</msg>
</Data>
</Document>
<Document>
<Metadata>
<asMap>
<global>
<doc_id>307dff59-0ec4-11e4-a996-49c25019fb1e</doc_id>
</global>
</asMap>
<id>307dff59-0ec4-11e4-a996-49c25019fb1e</id>
</Metadata>
<Data>
<msg>Goodbye, World!</msg>
</Data>
</Document>
</DocumentRoot> |
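The "@" and "$" naming conventions described above can be sketched in a few lines of Python. This is only an illustration of the mapping idea, not the Snap's or StAXON's implementation: the function names `fill_element` and `document_to_xml` are hypothetical, the treatment of lists as repeated elements is an assumption, and the Document and Metadata wrappers are left out.

```python
import xml.etree.ElementTree as ET


def fill_element(elem, value):
    """Populate elem from a JSON-like value using the '@' and '$' conventions."""
    if isinstance(value, dict):
        for key, child in value.items():
            if key.startswith("@"):
                # "@name" fields become attributes on the current node
                elem.set(key[1:], str(child))
            elif key == "$":
                # "$" holds the text of the current node; null leaves it empty
                elem.text = None if child is None else str(child)
            else:
                # Assumption: a list maps to repeated child elements
                items = child if isinstance(child, list) else [child]
                for item in items:
                    fill_element(ET.SubElement(elem, key), item)
    elif value is not None:
        elem.text = str(value)


def document_to_xml(doc):
    """Convert a single-key document such as {"msg": ...} to an XML string."""
    (name, value), = doc.items()
    root = ET.Element(name)
    fill_element(root, value)
    return ET.tostring(root, encoding="unicode")


print(document_to_xml({"msg": {"@attr1": "0", "$": "Hello, World!"}}))
# <msg attr1="0">Hello, World!</msg>
print(document_to_xml({"msg": "Goodbye, World!"}))
# <msg>Goodbye, World!</msg>
```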
Note | ||
---|---|---|
If the value associated with a key is null (a valid JSON value), an empty string is placed as the text between the XML elements. For example, a "msg" object that has an "@attr1" value of "0" and a null "$" value is transformed into <msg attr1="0"/>. |
If you would like to generate a file without the wrapper elements, you can clear the "Root Element" property, and the Snap will produce a single binary document for every input document, like so:
Code Block |
---|
<?xml version="1.0" encoding="UTF-8"?>
<msg attr1="0">Hello, World!</msg>
<?xml version="1.0" encoding="UTF-8"?>
<msg>Goodbye, World!</msg> |
The Snap can also validate documents against a provided schema to ensure that the output conforms to it. Any invalid documents are written to the error output view.
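The effect of the Root Element property described above can be pictured with a small Python sketch. It is a simplification under stated assumptions: `format_documents` is a hypothetical helper, each document is assumed to be a single key with a text value, and the Document, Metadata, and Data wrappers that the Snap adds are omitted.

```python
import xml.etree.ElementTree as ET


def format_documents(docs, root_name="DocumentRoot"):
    """Return a list of XML strings: one if root_name is set, else one per document."""
    def to_element(doc):
        (name, text), = doc.items()
        elem = ET.Element(name)
        elem.text = text
        return elem

    if root_name:
        # All documents are collected under a single root element
        root = ET.Element(root_name)
        for doc in docs:
            root.append(to_element(doc))
        return [ET.tostring(root, encoding="unicode")]
    # No root element: every input document becomes its own output
    return [ET.tostring(to_element(doc), encoding="unicode") for doc in docs]


docs = [{"msg": "Hello, World!"}, {"msg": "Goodbye, World!"}]
print(len(format_documents(docs)))                # 1
print(len(format_documents(docs, root_name="")))  # 2
```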
The Snap by default produces XML that is not canonical. You can transform the XML output to canonical XML by selecting the Format as Canonical XML checkbox. The canonical form of an XML output from XML Formatter Snap is an XML representation that excludes the XML prolog and includes the start and end tag for all the XML elements. For more information about Canonical XML Version 1.0 specifications, see https://www.w3.org/TR/xml-c14n.
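To see what dropping the prolog and writing explicit start and end tags looks like, the following sketch uses Python's standard library. Note the assumption: `xml.etree.ElementTree.canonicalize` implements Canonical XML 2.0, not the 1.0 specification linked above, so it only approximates what the Snap produces.

```python
import xml.etree.ElementTree as ET

xml_text = '<?xml version="1.0" encoding="UTF-8"?>\n<msg attr1="0"/>'

# canonicalize() omits the XML declaration and expands the empty-element tag
canonical = ET.canonicalize(xml_text)
print(canonical)
# <msg attr1="0"></msg>
```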
Accounts are not used with this Snap.
Input | This Snap has exactly one document input view. |
---|---|
Output | This Snap has exactly one binary output view. |
Error | This Snap has at most one document error view and produces zero or more documents in the view. |
Settings
Label
Outbound schema
XSD schema definition file URL for the outgoing data. The currently supported URL protocols are SLDB, HDFS, and S3.
Example: sldb:///foo/bar/customer.xsd
Default value: [None]
Validate XML
Required. Specifies whether the incoming data is validated against the XSD schema definition provided in the Inbound Schema property.
Note |
---|
If you enter an Inbound schema, then you must select the Validate XML and Match data types properties to derive the output as per the defined schema. |
Example: False
Default value: False
Root element
The name of the XML element under which all the incoming documents are added. If empty, no wrapping is done and each document is written out as a separate binary.
Example: DocumentRoot
Default value: DocumentRoot
Ignore empty stream
If selected, the Snap does nothing when no document is received in the input view. If not selected, the Snap writes empty XML data with a header to the output view when no document is received in the input view.
Default value: Not selected
Map input to first repeating element in XSD
If selected, the Snap ignores the root element from the XSD file. See the example below for more information.
Default value: Not selected
Format as Canonical XML
If selected, the Snap produces a canonical XML output.
The canonical form of an XML output (from XML Formatter Snap) is an XML representation that excludes the XML prolog and includes the start and end tag for all the XML elements. For more information about Canonical XML Version 1.0 specifications, see https://www.w3.org/TR/xml-c14n.
Default value: Not selected
Max schema levels
This configuration limits the number of schema levels provided by the outbound schema on the SnapLogic input view to a maximum of 10 levels. This configuration does not affect the validation capability. It serves to prevent outbound schemas from consuming too much memory and processing time.
Default value: 10
Schema language supported: W3C XML Schema 1.0
Prerequisites
None.
Support for Ultra Pipelines
Works in Ultra Task Pipelines if Root Element is not selected.
Limitations and Known Issues
None.
Snap Views
Type | Format | Number of Views | Examples of Upstream and Downstream Snaps | Description |
---|---|---|---|---|
Input | Document | | | JSON document. |
Output | Binary | | | XML document. |
Error | Error handling is a generic way to handle errors without losing data or failing the Snap execution. You can handle the errors that the Snap might encounter while running the Pipeline by choosing one of the options from the When errors occur list under the Views tab. Learn more about Error handling in Pipelines. |
Snap Settings
Field Name | Field Type | Description | ||||
---|---|---|---|---|---|---|
Label* Default value: XML Formatter | String | Specify the name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your Pipeline. | ||||
Outbound schema Default value: N/A | String/Expression | Specify the XSD schema definition file URL for the outgoing data. The currently supported URL protocols are SLDB, HDFS, and S3. The schema language supported is W3C XML Schema 1.0. | ||||
Output Character Set* | String/Expression/Suggestion | Specify or select the character set encoding for the output. | ||||
Validate output Default value: Deselected | Checkbox | Select this checkbox to validate the output against the XSD schema definition provided in the Outbound schema field. | ||||
Root element Default value: None | String | Specify the name of the XML element under which all the incoming documents are added. If left blank, no wrapping is done and each document is written out as a separate binary. | ||||
Ignore empty stream Default value: Not selected | Checkbox | Select this checkbox to make the Snap do nothing when no document is received in the input view. | ||||
Map input to first repeating element in XSD Default value: Not selected | Checkbox | Select this checkbox if you want the Snap to ignore the root element from the XSD file. See the example below for more information. | ||||
Format as Canonical XML Default value: Not selected | Checkbox | Select this checkbox to transform the XML output to canonical XML. The canonical form of an XML output (from XML Formatter Snap) is an XML representation that excludes the XML prolog and includes the start and end tag for all the XML elements. Learn more about Canonical XML Version 1.0 specifications: https://www.w3.org/TR/xml-c14n. | ||||
Max schema levels Default value: 10 | String | Specify the maximum number of schema levels to be processed by the Snap. This configuration limits the number of schema levels provided by the outbound schema on the SnapLogic input view to a maximum of 10 levels. This configuration does not affect the validation capability. It serves to prevent outbound schemas from consuming too much memory and processing time. | ||||
Snap Execution Default value: Validate & Execute | Dropdown list | Select one of the three modes in which the Snap executes. |
Additional Information
XML Formatter and XML Generator Output Differs for the Same XSD File Input
The XML Formatter and XML Generator Snaps produce different output for the same XSD file. To generate the same output from both Snaps, append the following to the required schema in the Target path property of the preceding Mapper Snap:
- XML Formatter: .$
- XML Generator: .value
Examples
In the following example, the Map input to first repeating element in XSD checkbox is selected.
Assume the given XSD file is:
Code Block |
---|
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="urn:books"
            xmlns:bks="urn:books">
  <xsd:element name="books" type="bks:BooksForm"/>
  <xsd:complexType name="BooksForm">
    <xsd:sequence>
      <xsd:element name="book" type="bks:BookForm" minOccurs="0" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="BookForm">
    <xsd:sequence>
      <xsd:element name="author" type="xsd:string" maxOccurs="unbounded"/>
      <xsd:element name="title" type="xsd:string"/>
      <xsd:element name="genre" type="xsd:string"/>
      <xsd:element name="price" type="xsd:float" />
      <xsd:element name="pub_date" type="xsd:date" />
      <xsd:element name="review" type="xsd:string"/>
    </xsd:sequence>
    <xsd:attribute name="id" type="xsd:string"/>
  </xsd:complexType>
</xsd:schema> |
Suppose the two input documents are:
Code Block |
---|
[
  {
    "@id": "bk001",
    "title": "The First Book",
    "author": "John Smith",
    "genre": "Fiction",
    "price": "44.95",
    "pub_date": "2000-10-01",
    "review": "An amazing story of nothing."
  },
  {
    "@id": "bk002",
    "title": "The Poet's First Poem",
    "author": "Mary Parker",
    "genre": "Poem",
    "price": "24.95",
    "pub_date": "2000-10-01",
    "review": "Least poetic poems."
  }
] |
The Snap extracts "books" as the root element name and "book" as the wrapper name from the XSD file.
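How a root element name and a first repeating element name might be read from an XSD can be sketched as follows. This is an illustrative guess at the idea, not the Snap's actual logic: the helper names are hypothetical, the schema is an abbreviated copy of the one above (BookForm is omitted), and only a top-level element with a named complex type is handled.

```python
import xml.etree.ElementTree as ET

XSD_NS = "{http://www.w3.org/2001/XMLSchema}"

# Abbreviated version of the books schema from this example
XSD_TEXT = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="urn:books" xmlns:bks="urn:books">
  <xsd:element name="books" type="bks:BooksForm"/>
  <xsd:complexType name="BooksForm">
    <xsd:sequence>
      <xsd:element name="book" type="bks:BookForm" minOccurs="0" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>"""


def is_repeating(element):
    """An element repeats if maxOccurs is 'unbounded' or greater than 1."""
    max_occurs = element.get("maxOccurs", "1")
    return max_occurs == "unbounded" or int(max_occurs) > 1


def root_and_first_repeating(xsd_text):
    """Return (root element name, first repeating element name or None)."""
    schema = ET.fromstring(xsd_text)
    root_decl = schema.find(XSD_NS + "element")  # first top-level element
    type_name = root_decl.get("type").split(":")[-1]
    for complex_type in schema.findall(XSD_NS + "complexType"):
        if complex_type.get("name") == type_name:
            for element in complex_type.iter(XSD_NS + "element"):
                if is_repeating(element):
                    return root_decl.get("name"), element.get("name")
    return root_decl.get("name"), None


print(root_and_first_repeating(XSD_TEXT))
# ('books', 'book')
```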
The following is the default XML output:
Code Block |
---|
<?xml version='1.0' encoding='UTF-8'?>
<books>
  <book id="bk001">
    <title>The First Book</title>
    <author>John Smith</author>
    <genre>Fiction</genre>
    <price>44.95</price>
    <pub_date>2000-10-01</pub_date>
    <review>An amazing story of nothing.</review>
  </book>
  <book id="bk002">
    <title>The Poet's First Poem</title>
    <author>Mary Parker</author>
    <genre>Poem</genre>
    <price>24.95</price>
    <pub_date>2000-10-01</pub_date>
    <review>Least poetic poems.</review>
  </book>
</books> |
The following is the canonical XML output:
Code Block |
---|
<books>
  <book id="bk001">
    <title>The First Book</title>
    <author>John Smith</author>
    <genre>Fiction</genre>
    <price>44.95</price>
    <pub_date>2000-10-01</pub_date>
    <review>An amazing story of nothing.</review>
  </book>
  <book id="bk002">
    <title>The Poet's First Poem</title>
    <author>Mary Parker</author>
    <genre>Poem</genre>
    <price>24.95</price>
    <pub_date>2000-10-01</pub_date>
    <review>Least poetic poems.</review>
  </book>
</books> |
...