SnapLogic's Data Catalog Service lets you store metadata from Pipelines in Table Assets and query information about the actual source data without having to know where it is stored.
Because data format, location, and schema can change over time as new data passes through Pipelines, the Table Asset enables you to organize your metadata in the SnapLogic UI. This Asset serves as a logical abstraction for the data stored on external storage systems, such as AWS S3. Because your source tables are divided into partitions, you can enter those partition keys into the Table Asset. Users who need to read from the data lake can then work with the Table abstraction without having to know the physical storage location of the data files.
You can store metadata from your Pipelines by entering partition keys as values in the Catalog Snaps, along with the schema of the data drawn from the source. The Catalog Insert and Catalog Writer Snaps populate a Table with partitions, while the Catalog Delete Snap deletes them. You can also create, query, and delete Tables directly in SnapLogic Manager. Additionally, you can use tools such as Spark to process metadata when running eXtreme Pipelines.
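The relationship between a Table Asset, its partition keys, and the physical data can be pictured with a small model. The following is only an illustrative sketch in Python, not SnapLogic's implementation or API: the class names TableAsset and Partition, their methods, and the S3 paths are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    # Partition key values, for example {"year": "2023", "region": "us"}
    keys: dict
    # Physical location of the data files (hypothetical path)
    location: str
    # Column name -> type name for the data in this partition
    schema: dict


@dataclass
class TableAsset:
    # Hypothetical stand-in for a Table Asset; not a SnapLogic class.
    label: str
    partition_keys: list
    partitions: list = field(default_factory=list)

    def insert(self, partition):
        # Reject partitions whose keys differ from the Table's declared keys.
        if set(partition.keys) != set(self.partition_keys):
            raise ValueError("partition keys do not match the Table definition")
        self.partitions.append(partition)

    def delete(self, keys):
        # Remove every partition whose key values equal `keys`.
        self.partitions = [p for p in self.partitions if p.keys != keys]

    def locations(self, **wanted):
        # Resolve logical key values to physical locations.
        return [
            p.location
            for p in self.partitions
            if all(p.keys.get(k) == v for k, v in wanted.items())
        ]


orders = TableAsset(label="orders", partition_keys=["year", "region"])
orders.insert(Partition({"year": "2023", "region": "us"},
                        "s3://example-bucket/orders/year=2023/region=us/",
                        {"id": "int"}))
orders.insert(Partition({"year": "2023", "region": "eu"},
                        "s3://example-bucket/orders/year=2023/region=eu/",
                        {"id": "int"}))

# A reader asks for data by key values and never handles the path layout.
eu_paths = orders.locations(region="eu")
```

In this sketch the reader supplies only key values, and the model resolves them to storage paths, which mirrors the way the Table abstraction hides the physical location.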
Creating a Table in SnapLogic Manager
To create a new Table Asset directly in SnapLogic Manager:
Go to the Project folder in which you want to create the Table Asset.
Click the icon on the right side of the menu bar, and click Table; the Create Table form is displayed.
Enter the name of the Table in the Label field and the names of the Partition Keys. If there are multiple partition keys, enter each of them in a separate row.
Click Save. The Table is created and can be viewed from the Tables tab.
Querying Metadata and Schema Information in Table Partitions
To query partitions in a Table Asset:
In SnapLogic Manager, go to the Project where the Table Asset resides, then click the target Table. If the Project contains several assets, you can either search for the Table or click the Table tab to show only Tables and select the target Table from there.
Select a Partition Key from the Partition drop-down menu.
Select an Operator from the drop-down menu (=, !=, =>, >, >=, <, <=, like). These operators are identical to those in the Catalog Query Snap.
Enter the keyword in the Search Term field and click Go. The matching partitions are displayed. If you click Go when the fields under Search Partitions are empty, then all Partitions are displayed.
To search for a string composed only of numbers, enclose the numbers in double quotation marks in the Search Term field.
To display schema information for the entire Table, click Show Table Schema; a union of all partition schemas is displayed. Alternatively, you can upload a new schema for the Table through the displayed pop-up.
For further analysis, select a partition:
To browse or search for Key and Value pairs, click the icon.
To search for or filter out specific names and attributes in the Schema, click the icon.
To delete the partition, click the icon.
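The search and schema steps above can be pictured with a short Python sketch. It is not the Manager's or the Catalog Query Snap's actual logic. It assumes that key values compare as strings, that like behaves as a substring match, and that a column appearing in several partitions keeps the first type seen; the function names search_partitions and table_schema and the sample data are hypothetical. The => entry from the drop-down is left out because its meaning is not described here.

```python
import operator

# Hypothetical operator table; `like` is modelled as a substring match.
OPERATORS = {
    "=": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "like": lambda value, term: term in value,
}


def search_partitions(partitions, key=None, op=None, term=None):
    # Empty search fields return every partition, as clicking Go does.
    if not key or not op or not term:
        return list(partitions)
    compare = OPERATORS[op]
    return [
        p for p in partitions
        if key in p["keys"] and compare(p["keys"][key], term)
    ]


def table_schema(partitions):
    # Union of all partition schemas; the first type seen wins on a clash.
    merged = {}
    for p in partitions:
        for column, type_name in p["schema"].items():
            merged.setdefault(column, type_name)
    return merged


partitions = [
    {"keys": {"year": "2022", "region": "us"},
     "schema": {"id": "int", "total": "double"}},
    {"keys": {"year": "2023", "region": "us"},
     "schema": {"id": "int", "total": "double", "coupon": "string"}},
    {"keys": {"year": "2023", "region": "eu"},
     "schema": {"id": "int", "vat": "double"}},
]

# Partition key "year", operator ">=", search term "2023"
recent = search_partitions(partitions, "year", ">=", "2023")
# Equivalent of Show Table Schema: every column from every partition
combined = table_schema(partitions)
```

Because the sketch stores key values as strings, a numeric-looking term such as "2023" is compared as text, which is one plausible reading of why number-only search terms are quoted in the Search Term field.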
Deleting a Table Asset in SnapLogic Manager
You can delete a Table Asset only in SnapLogic Manager.
To delete a Table:
Go to the Project folder in which the target Table Asset resides, and click the Table tab.
Select the target Table, then click the icon. A pop-up prompts you to confirm the deletion.