Understanding the Mapping Root


Documents in a pipeline can be hierarchical, meaning an object can contain other objects or arrays, which themselves can contain objects or arrays.  For example, the following JSON document is hierarchical since the root object contains an object in the "child" field:

  {
    "name": "Acme",
    "child": { "field1": 1, "field2": 2 }
  }

Mapping simple hierarchical documents that contain only nested objects is straightforward because you can map one field directly to another. Mapping documents that contain arrays of objects is more complicated because the objects in the array must be mapped separately from the parent object: there is no unambiguous way to describe the array mapping using the expression language and JSONPath alone. To address this, the Mapper Snap provides the Mapping Root property.

The Mapping Root property is a JSONPath that limits the scope of a mapping to the parts of the document that match the given path. For example, a Mapping Root like $.my_array[*] tells the Mapper to iterate over the objects in the array and transform each object based on the mapping. The parts of the document that do not match the Mapping Root are passed through untouched in the output. By default, the root is set to $, which is the root of the document.
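The scoping behavior described above can be illustrated outside the product with a minimal Python sketch. It is not the Mapper's implementation: the helper `apply_at_array_root` and the field names `id`, `a`, and `b` are hypothetical, and the sketch only handles a root of the form $.my_array[*], assuming each matched object is replaced by the result of the mapping.

```python
import copy

def apply_at_array_root(document, array_field, map_element):
    """Apply map_element to each object in document[array_field].

    Everything outside the array is copied to the output unchanged,
    mimicking the pass-through described for the Mapping Root.
    (Hypothetical helper, not part of any SnapLogic API.)
    """
    result = copy.deepcopy(document)
    result[array_field] = [map_element(item) for item in document[array_field]]
    return result

# "id" sits outside the root and passes through; each array element is remapped.
doc = {"id": 7, "my_array": [{"a": 1}, {"a": 2}]}
out = apply_at_array_root(doc, "my_array", lambda item: {"b": item["a"] * 10})
print(out)
```

The input document is left unmodified; only the returned copy carries the transformed array.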

Because each array must be mapped separately, add one additional Mapper Snap for each array mapping. Chain the Mapper Snaps so that the top levels of the hierarchy are mapped before the lower levels. This ordering matters because, once a Mapping Root is set, the Mapper UI pares down the Input and Target schema views to show only the fields in the objects of the array. The outer structures of the source and target documents therefore need to agree; otherwise the schema views are not useful.

As a more complete example, we will build a pipeline that maps the following source document to a target document.

Source document:

{
    "name": "Acme",
    "employee": [
        { "first_name": "Bob", "last_name": "Smith", "age": 32 },
        { "first_name": "Joe", "last_name": "Doe", "age": 44 }
    ]
}

Target document:

{
    "company_name": "Acme",
    "workers": [
        { "name": "Bob Smith", "age": 32 },
        { "name": "Joe Doe", "age": 44 }
    ]
}

The source document is hierarchical because it contains an array of objects, so you need two separate Mapper Snaps: one to map the parent fields and another to map the elements in the "employee" array. The first Mapper's configuration is simple because it only renames fields:

Source                              Target
$name                               $company_name
$employee                           $workers
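As a rough illustration of what this first mapping does to the source document, the following Python sketch performs the same two renames at the document root. The function name `first_mapper` is hypothetical, and the sketch only models the effect of the mapping table, not how the Snap works internally.

```python
source = {
    "name": "Acme",
    "employee": [
        {"first_name": "Bob", "last_name": "Smith", "age": 32},
        {"first_name": "Joe", "last_name": "Doe", "age": 44},
    ],
}

def first_mapper(doc):
    """Rename the top-level fields; array elements are carried over unchanged."""
    return {
        "company_name": doc["name"],   # $name     -> $company_name
        "workers": doc["employee"],    # $employee -> $workers
    }

intermediate = first_mapper(source)
print(intermediate)
```

After this step the outer structure matches the target, but the objects inside "workers" still carry the "first_name", "last_name", and "age" fields.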

The second Mapper is connected to the output of the first, so that it can work on the lower levels of the document hierarchy. The "Mapping Root" for this Snap will need to be changed, so that only the objects in the "workers" array will be impacted by the mapping transformations. After setting the root, note that the Input schema changes to only show the fields in the array objects. If there is a target schema available, it is narrowed down to show the "name" and "age" fields.   

Mapping Root: $workers[*]

Source                              Target
$first_name + " " + $last_name      $name
$age                                $age
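The effect of this second mapping can be sketched in Python as well. The sketch starts from the document as it would look after the first Mapper and applies the two expressions from the table to each object in the "workers" array. The names `map_worker` and `second_mapper` are hypothetical, and the code assumes that fields outside the Mapping Root pass through unchanged, as described earlier.

```python
# Document as it would arrive from the first Mapper (outer fields already renamed).
intermediate = {
    "company_name": "Acme",
    "workers": [
        {"first_name": "Bob", "last_name": "Smith", "age": 32},
        {"first_name": "Joe", "last_name": "Doe", "age": 44},
    ],
}

def map_worker(worker):
    """Mapping applied to each object matched by the root $workers[*]."""
    return {
        "name": worker["first_name"] + " " + worker["last_name"],
        "age": worker["age"],
    }

def second_mapper(doc):
    """Transform only the array elements; other fields pass through."""
    out = dict(doc)
    out["workers"] = [map_worker(w) for w in doc["workers"]]
    return out

target = second_mapper(intermediate)
print(target)
```

Running the sketch yields a document with the same shape as the target document shown earlier: "company_name" is untouched and each worker has only "name" and "age".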