Understanding the Mapping Root

Documents in a pipeline can be hierarchical, meaning an object can contain other objects or arrays, which themselves can contain objects or arrays.  For example, the following JSON document is hierarchical since the root object contains an object in the "child" field:

  {
    "name": "Acme",
    "child": { "field1": 1, "field2": 2 }
  }

Mapping simple hierarchical documents that only contain other objects is straightforward since you can directly map one field to another. However, mapping documents that contain arrays of objects is more complicated because the objects in the array must be mapped separately from the parent object. The mapping is separate because there is no unambiguous way to describe the array mapping using the expression language and JSONPaths. The Mapper Snap provides the Mapping Root property to handle arrays.

The Mapping Root property is a JSONPath that limits the scope of a mapping to the parts of the document that match the given path. For example, a Mapping Root like $.my_array[*] enables the Mapper to iterate over the objects in the array and transform each object based on the mapping. The other parts of the document that do not match the Mapping Root are passed through untouched in the output. By default, the root is set to $, which is the root of the document. 
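The scoping behavior described above can be sketched in Python. This is only an illustration of the idea, not the Mapper Snap's implementation: the function name map_with_root, the "id" field, and the sample transform are hypothetical, and the sketch handles only a single top-level array field rather than an arbitrary JSONPath.

```python
import copy


def map_with_root(document, array_field, transform):
    """Apply transform to each object in document[array_field].

    Hypothetical helper: it mimics a Mapping Root of $.<array_field>[*]
    by transforming only the array elements and passing every other
    part of the document through unchanged.
    """
    result = copy.deepcopy(document)  # leave the input document untouched
    result[array_field] = [transform(item) for item in result[array_field]]
    return result


# Sample document with one field outside the array and two array objects.
doc = {"id": 7, "my_array": [{"a": 1}, {"a": 2}]}

# The transform sees one array object at a time, never the parent document.
out = map_with_root(doc, "my_array", lambda item: {"b": item["a"] * 10})
print(out)  # {'id': 7, 'my_array': [{'b': 10}, {'b': 20}]}
```

Note that the "id" field, which does not match the root, appears in the output exactly as it was in the input.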

To move any variables inside $original to root level, you must map the variables to root variables. For example, you must map $original.salary to $salary and $original.balance to $balance.

Because array mappings must be done separately, add a Mapper Snap for each array mapping. Chain the Mapper Snaps so that the top levels of the hierarchy are mapped before the lower levels. This ordering matters because the Mapper UI pares down the Input and Target schema views to show only the fields that are in the objects of the array.

The outer structures of the document need to agree between the source and target; otherwise, the schema views are not useful. 

As a more complete example, we built a pipeline that maps the following source document to a target document.

Source document:

  {
    "name": "Acme",
    "employee": [
      { "first_name": "Bob", "last_name": "Smith", "age": 32 },
      { "first_name": "Joe", "last_name": "Doe", "age": 44 }
    ]
  }

Target document:

  {
    "company_name": "Acme",
    "workers": [
      { "name": "Bob Smith", "age": 32 },
      { "name": "Joe Doe", "age": 44 }
    ]
  }

The source document is hierarchical because it contains an array of objects. You need two separate Mapper Snaps: one to map the parent fields and another to map the elements in the "employee" array. The first Mapper's configuration is simple because it only renames fields:

Source       Target
$name        $company_name
$employee    $workers

The second Mapper is connected to the output of the first, so that it works on the lower levels of the document hierarchy. The "Mapping Root" for this Snap must be changed, so that only the objects in the "workers" array will be impacted by the mapping transformations. After setting the root, the Input schema changes to only show the fields in the array objects. If there is a target schema available, it is narrowed down to show the "name" and "age" fields.  

Mapping Root: $workers[*]

Source                            Target
$first_name + " " + $last_name    $name
$age                              $age
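
The two chained mappings can be simulated in plain Python to check that they produce the target document. This is a rough sketch of the data flow only; the function names map_parent, map_worker, and map_workers are hypothetical and do not correspond to any Snap API.

```python
def map_parent(doc):
    # First Mapper (root $): rename the top-level fields.
    # The array is carried over as-is under its new name.
    return {"company_name": doc["name"], "workers": doc["employee"]}


def map_worker(worker):
    # Second Mapper's table, applied to a single array object.
    return {
        "name": worker["first_name"] + " " + worker["last_name"],
        "age": worker["age"],
    }


def map_workers(doc):
    # Second Mapper (root $workers[*]): transform each array object
    # and pass the rest of the document through unchanged.
    return {**doc, "workers": [map_worker(w) for w in doc["workers"]]}


source = {
    "name": "Acme",
    "employee": [
        {"first_name": "Bob", "last_name": "Smith", "age": 32},
        {"first_name": "Joe", "last_name": "Doe", "age": 44},
    ],
}

# Chain the stages top-down: parent fields first, then array elements.
target = map_workers(map_parent(source))
print(target)
```

Running the parent mapping first is what lets the second stage address the array by its new name, "workers".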