Overview

Use the Mask Snap to hide sensitive information in your dataset before you export the dataset for analytics. You can protect sensitive data by using masking algorithms that the Snap provides out of the box.

Snap Input and Output

Input/Output	Type of View	Number of Views	Compatible Upstream and Downstream Snaps	Description
Input	Document	Min: 1, Max: 1	Mapper, MySQL - Select, REST Get	A dataset where some of the data must be masked.
Output	Document	Min: 1, Max: 1	MySQL - Insert, JSON Formatter, CSV Formatter	A dataset where the specified data is masked.

Support for Ultra Pipelines

Works in Ultra Pipelines.

Snap Settings

Parameter Name	Data Type	Description	Default Value	Example
Label	String	Required. The name for the Snap. Modify this to be more specific, especially if there is more than one of the same Snap in the Pipeline.	N/A	Mask
Mask Policy	N/A	Required. Enables you to specify the policies that you want to use to mask data in the input dataset.	N/A	N/A
Field	String	The field/parent field in the input dataset that contains the data to be masked.	N/A	name
Search Mode	N/A	The mode that the Snap uses to search for sensitive data. Select from the following options: Exact Path: The Snap searches for the exact field name. Recursive: The Snap searches through all the levels in the nested structure of the specified Field. For example, to mask the $credit_card field in Exact Path mode, you must enter $credit_card in Field. In Recursive mode, you can enter just $ in Field.	N/A	Exact Path
Match Field	String	The type of field in the input data to be matched. Select from the following options: Key: Select if the data to be matched is the field name. Value: Select if the data to be matched is the field value. The Snap fails to validate if the Match Field is Key and the Mask Field is also Key. In such a case, select Value as the Mask Field.	N/A	Value
Match Condition	N/A	The match condition that determines whether the Match Field should be matched. For example, if your input dataset contains a $credit_card field and you set Match Field to Value and Match Condition to Number, the Snap masks all $credit_card fields that contain a number as the value. If the field contains text as the value, the Snap skips masking that value. Select from the following options: Regex Match: Matches the Key or Value when the key or value matches the expression entered in Match Pattern. If Match Pattern is blank, the Snap matches all values. Credit Card: Matches the Key or Value when the key or value follows the number pattern used in major credit cards. The supported credit card types are: Visa, MasterCard, JCB, American Express, Discover, and Diners Club. The Snap supports either a continuous string of digits or groups of four digits separated by dashes or spaces. For example, 4111 1111 1111 1111, 3589-0731-0185-9601, and 6011000000000004. SSN: Matches the Key or Value if the key or value is in the Social Security Number (SSN) pattern. The Snap supports values entered in the SSN format only, that is, XXX-YY-ZZZZ. You can use blanks or dashes as separators. For example, 123-12-1234 and 123 12 1234. Number: Matches the Value when the value is an integer or decimal. Number (Text): Matches the Key or Value when the key or value is a number in text format. The Snap does not support commas within the number text. Date: Matches the Value when the value is a date object. Date (Text): Matches the Key or Value when the key or value is a date in text format or matches a pattern entered in Match Pattern. You can specify the format in Match Pattern. The Snap supports the following formats: "2018-08-12", "2018-08-12T12:34:56", "2018/08/12 12:34:56.780", "2018/08/12 12:34:56.78", "2018-08-12T12:34:56.78", and "2018-08-12 12:34:56".	N/A	Date
Match Pattern	String	The pattern to match in the input dataset: a regular expression for Regex Match or a date format for Date (Text). This is applicable only when the Match Condition is Regex Match or Date (Text).	N/A	Hello
Mask Field	N/A	The field that contains sensitive data and will be masked if the matching conditions are met. Select from the following options: Key: Select if the data to be masked is the key. Value: Select if the data to be masked is the value. The Snap fails to validate if the Match Field is Key and the Mask Field is also Key. In such a case, select Value as the Mask Field.	N/A	Value
Mask Method	N/A	The method to use to mask sensitive information. Select from the following options: Replace: Replaces the matched Value with the value you enter in Mask Value. Shuffle: Shuffles the matched data Value randomly. Remove: Removes the matched Key or Value. If the Mask Field is Value, the Snap returns null as the new value. If the Mask Field is Key, the Snap removes the whole Key-Value pair. With arrays and Recursive Search Mode, if the Mask Field is Value, the value is removed from the array. If all values are removed, the Snap returns an empty array. However, if the Mask Field is Key, the Snap removes the array along with the key. Random: Replaces the matched Value with a new random value. The Snap derives the random value based on the data type of the matched value and behaves as described below: Text: Randomly replaces the value with alphanumeric characters having the same length as the original data. Integer: Randomly replaces the value with an integer in the range from 0 to the largest integer that has the same number of digits as the original value. The new value is always different from the original value. For example, if the original value is 120, the replacement is a value from 0 through 999, except 120. Decimal: Randomly replaces the value with a decimal of the same precision, in the range from 0 up to the largest value whose most significant digit is in the same position as in the original value. For example, if the original value is 0.023, the replacement is a value from 0.000 through 0.099, except 0.023. Boolean: Randomly replaces the value with either true or false. Start of Month: Replaces the matched Value with the first day of the same month. If the value is text, the Snap converts the text to a date in the same way as the Auto mode of the Type Converter Snap. Start of Year: Replaces the matched Value with the first day of the same year. If the value is text, the Snap converts the text to a date in the same way as the Auto mode of the Type Converter Snap.	N/A	Remove
Mask Value	String, Numeric	The value that must replace the sensitive information in the input dataset. You can enter either a fixed value or an expression. This is applicable only when the Mask Method is Replace.	N/A	0
Snap Execution	N/A	Specify the execution type from the following options: Validate & Execute: Performs limited execution of the Snap (up to 50 records) during Pipeline validation; performs full execution of the Snap (unlimited records) during Pipeline execution. Execute only: Performs full execution of the Snap during Pipeline execution; does not execute the Snap during Pipeline validation. Disabled: Disables the Snap and, by extension, the downstream Snaps.	Validate & Execute	Validate & Execute
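
The Random mask method described in the table above can be approximated with the following Python sketch. It is only an illustration of the documented behavior, not the Snap's implementation: the function name random_mask is hypothetical, decimals are modeled with Python's Decimal type so that the precision is preserved, and the decimal range is our reading of the 0.023 example.

```python
import random
import string
from decimal import Decimal


def random_mask(value):
    """Return a random replacement chosen according to the type of `value`."""
    # bool is tested first because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return random.choice([True, False])
    if isinstance(value, int):
        # 120 has three digits, so candidates are 0..999, never 120 itself.
        upper = 10 ** len(str(abs(value))) - 1
        candidate = value
        while candidate == value:
            candidate = random.randint(0, upper)
        return candidate
    if isinstance(value, Decimal):
        # Assumption: randomize the unscaled digits and keep the exponent,
        # so 0.023 (digits 23, exponent -3) maps into 0.000..0.099.
        _sign, digits, exponent = value.as_tuple()
        unscaled = int("".join(str(d) for d in digits))
        upper = 10 ** len(digits) - 1
        candidate = unscaled
        while candidate == unscaled:
            candidate = random.randint(0, upper)
        return Decimal(candidate).scaleb(exponent)
    if isinstance(value, str):
        # Same length as the original text, alphanumeric characters only.
        alphabet = string.ascii_letters + string.digits
        return "".join(random.choices(alphabet, k=len(value)))
    return value  # other types are left unchanged in this sketch


print(random_mask(120))               # some integer in 0..999 other than 120
print(random_mask(Decimal("0.023")))  # some decimal in 0.000..0.099 other than 0.023
```

The retry loops are what guarantee that the replacement differs from the original value for integers and decimals.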

Example

This Pipeline demonstrates how the Mask Snap hides sensitive information in a dataset. In this example, the input dataset contains demographic data about Oscar award winners. We want to pass the dataset to a third party for analysis, but first we want to hide the sensitive information in it. We use the Mask Snap to identify and replace or remove the sensitive fields in the list of Oscar award winners. Although this dataset is public, you can apply the same mask policies to hide confidential or sensitive data in your organization's datasets.

Download the Pipeline.

The dataset is derived from Kaggle and contains demographic data about Oscar award winners in the Best Director category from 1927 through 1976. The File Reader Snap reads the input file and passes it to the CSV Parser Snap. The following is a preview of the dataset in CSV format:

The dataset is passed to the Mask Snap to hide sensitive data. In this example, we want to mask the following fields:

date_of_birth: Replace all dates of birth with the first day of the respective year of birth
bio_url: Delete all HTTP/HTTPS URLs from the dataset
person: Replace the name of the winner with the text "Winner name is masked"

To do this, we configure the Mask Snap as follows:

We have added three rows under Mask Policy, and each row enables us to mask one of the fields discussed above.

The first row under Mask Policy enables us to mask the date_of_birth field. We specify the Field as $date_of_birth and select the Search Mode as Exact Path. You can use Exact Path when you are sure that the field does not contain data that is nested or in an array. The Match Field is Value and the Match Condition is Date (Text). The Mask Method is set to Start of Year. With this configuration, the Snap replaces all $date_of_birth values that are in a standard date format with the first day of the year of birth. For example, in the first row of the dataset, the date of birth is 1895-09-30. The Snap replaces this value with 1895-01-01.
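
As a rough illustration of the Start of Year policy described above, the following Python sketch parses a date held as text and resets it to the first day of its year. It is not the Snap's implementation: the helper names are hypothetical, the format list is an assumption based on the formats listed for the Date (Text) match condition, and returning a date-only string is also an assumption.

```python
from datetime import datetime

# strptime equivalents of the text formats listed for Date (Text).
# %f accepts one to six fractional digits, so ".78" and ".780" both parse.
DATE_FORMATS = [
    "%Y-%m-%d",
    "%Y-%m-%dT%H:%M:%S",
    "%Y-%m-%dT%H:%M:%S.%f",
    "%Y-%m-%d %H:%M:%S",
    "%Y/%m/%d %H:%M:%S.%f",
]


def parse_date_text(text):
    """Return a datetime if `text` matches a known format, else None."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return None


def start_of_year(text):
    """Replace a date in text form with the first day of the same year."""
    parsed = parse_date_text(text)
    if parsed is None:
        return text  # not a date in text form: the value is left unmasked
    return parsed.date().replace(month=1, day=1).isoformat()


def start_of_month(text):
    """Replace a date in text form with the first day of the same month."""
    parsed = parse_date_text(text)
    if parsed is None:
        return text
    return parsed.date().replace(day=1).isoformat()


print(start_of_year("1895-09-30"))   # 1895-01-01
print(start_of_month("1895-09-30"))  # 1895-09-01
```

Values that do not parse as dates are returned unchanged, which mirrors the Snap skipping values that do not satisfy the Match Condition.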

The second row under Mask Policy enables us to mask the bio_url field. We specify the Field as $ and select the Search Mode as Recursive. In Recursive mode, the Snap searches all fields for values that match the regular expression entered in Match Pattern: https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&\/\/=]*). This regular expression matches all HTTP and HTTPS URLs in the dataset. In this case, the Match Field is Value, the Mask Field is Key, the Match Condition is Regex Match, and the Mask Method is Remove. The Snap deletes every key-value pair whose value contains an HTTP or HTTPS URL. Hence, we do not see the bio_url field in the output.

In Recursive mode, you can mask multiple fields at a time, depending on the condition specified. For example, if your data might be nested or in an array and you cannot specify the exact path, you can use Recursive mode. In this example, we are not sure how many fields contain URLs. Hence, we use Recursive mode to mask all fields whose values contain a URL.
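
The recursive URL policy described above can be sketched in Python as follows. This is an approximation under stated assumptions, not the Snap's code: the function names are hypothetical, the pattern is applied with re.search (the Snap may anchor the match differently), and the URLs in the sample record are placeholders rather than values from the dataset.

```python
import re

# The Match Pattern used in the second Mask Policy row.
URL_PATTERN = re.compile(
    r"https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\."
    r"[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&\/\/=]*)"
)


def contains_url(value):
    """True if a text value, or any text item of an array, contains a URL."""
    if isinstance(value, str):
        return URL_PATTERN.search(value) is not None
    if isinstance(value, list):
        return any(contains_url(item) for item in value)
    return False


def remove_url_keys(node):
    """Walk every nesting level and drop key-value pairs whose value matches."""
    if isinstance(node, dict):
        # Mask Field = Key with Remove: the whole pair is dropped. An array
        # holding a URL is dropped together with its key.
        return {
            key: remove_url_keys(value)
            for key, value in node.items()
            if not contains_url(value)
        }
    if isinstance(node, list):
        return [remove_url_keys(item) for item in node]
    return node


record = {
    "person": "Lewis Milestone",
    "bio_url": "http://www.example.com/bio",          # placeholder URL
    "links": {"wiki": "https://example.org/page", "note": "n/a"},
    "tags": ["a", "b"],
}
cleaned = remove_url_keys(record)
print(cleaned)
```

Because the walk starts at the document root, no field path has to be known in advance, which is the situation where Recursive mode is useful.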

The third row under Mask Policy enables us to mask the name of the winner. We specify the Field as $person and select the Search Mode as Exact Path. The Match Field is Key, the Match Condition is Regex Match, and the Match Pattern is .*, which matches any text, so the $person field always qualifies. The Mask Method is Replace and the Mask Field is Value. The Mask Value is Winner name is masked. The Snap replaces all $person values with Winner name is masked. For example, in the first row of the dataset, the value Lewis Milestone is replaced with Winner name is masked. You can use this feature when you do not want to remove a value but want to replace it with static text that tells the consumer of the dataset that the original value is masked.
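
The Replace policy described above amounts to a conditional overwrite, which the following Python sketch illustrates. The function name replace_if_key_matches is hypothetical, and testing the key with re.search is an assumption about how the Snap applies the Match Pattern.

```python
import re


def replace_if_key_matches(record, field, match_pattern, mask_value):
    """Return a copy of `record` with `field` replaced when its key matches."""
    masked = dict(record)  # shallow copy so the input document is untouched
    # Match Field = Key: the pattern is tested against the field name.
    if field in masked and re.search(match_pattern, field):
        # Mask Field = Value, Mask Method = Replace.
        masked[field] = mask_value
    return masked


row = {"person": "Lewis Milestone", "date_of_birth": "1895-09-30"}
masked_row = replace_if_key_matches(row, "person", r".*", "Winner name is masked")
print(masked_row)
```

With the pattern .* the condition is always true, so every document that has a person field receives the static text.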

The following screenshot shows the output dataset after the Mask Snap applies the three mask policies:

In the output dataset preview, we can see that the date_of_birth field shows the first day of the year of birth instead of the exact birth date, the bio_url field is removed, and the person field shows Winner name is masked instead of the real name.

The masked dataset output of the Mask Snap is then converted to CSV format using the CSV Formatter Snap and then passed to a File Writer Snap.

Snap Pack History

Release	Snap Pack Version	Date	Type	Updates
February 2024	436patches25781	03 Apr 2024	Latest	Enhanced the Deduplicate Snap to honor an interrupt while waiting in the delay loop, so that the Snap manages memory efficiently.
February 2024	main25112	14 Feb 2024	Stable	Updated and certified against the current SnapLogic Platform release.
November 2023	main23721	08 Nov 2023	Stable	Updated and certified against the current SnapLogic Platform release.
August 2023	main22460	16 Aug 2023	Stable	Updated and certified against the current SnapLogic Platform release.
May 2023	433patches21572	20 Jun 2023	Latest	The Deduplicate Snap now manages memory efficiently and eliminates out-of-memory crashes using the following fields: Minimum memory (MB) and Minimum free disk space (MB).
May 2023	433patches21247	31 May 2023	Latest	Fixed an issue with the Match Snap where a null pointer exception was thrown when the second input view had fewer records than the first.
May 2023	main21015	10 May 2023	Stable	Upgraded with the latest SnapLogic Platform release.
February 2023	main19844	09 Feb 2023	Stable	Upgraded with the latest SnapLogic Platform release.
December 2022	431patches19268	19 Dec 2022	Latest	The Deduplicate Snap now ignores fields with empty strings and whitespaces as no data.
November 2022	main18944	10 Nov 2022	Stable	Upgraded with the latest SnapLogic Platform release.
August 2022	main17386	11 Aug 2022	Stable	Upgraded with the latest SnapLogic Platform release.
4.29	main15993	14 May 2022	Stable	Upgraded with the latest SnapLogic Platform release.
4.28	main14627	20 Jul 2022	Stable	Enhanced the Type Converter Snap with the Fail safe upon execution checkbox. Select this checkbox to enable the Snap to convert data with valid data types, while ignoring invalid data types.
4.27	427patches13730			Enhanced the Type Converter Snap with the Fail safe upon execution checkbox. Select this checkbox to enable the Snap to ignore invalid data types and convert data with valid data types.
4.27	427patches13948	07 Jan 2022	Latest	Fixed an issue with the Principal Component Analysis Snap, where a deadlock occurred when data is loaded from both the input views.
4.27	main12833	13 Nov 2021	Stable	Upgraded with the latest SnapLogic Platform release.
4.26	main11181	14 Aug 2021	Stable	Upgraded with the latest SnapLogic Platform release.
4.25	425patches10994	04 Aug 2021		Fixed an issue with the Deduplicate Snap where the Snap failed when running on a locale that does not format decimals with the period (.) character.
4.25	main9554	08 May 2021	Stable	Upgraded with the latest SnapLogic Platform release.
4.24	main8556	13 Feb 2021	Stable	Upgraded with the latest SnapLogic Platform release.
4.23	main7430	14 Nov 2020	Stable	Upgraded with the latest SnapLogic Platform release.
4.22	main6403	12 Sep 2020	Stable	Upgraded with the latest SnapLogic Platform release.
4.21	snapsmrc542	09 May 2020	Stable	Introduces the Mask Snap that enables you to hide sensitive information in your dataset before exporting the dataset for analytics or writing the dataset to a target file. Enhances the Match Snap to add a new field, Match all, which matches one record from the first input with multiple records in the second input. Also, enhances the Comparator field in the Snap by adding one more option, Exact, which identifies and classifies a match as either an exact match or not a match at all. Enhances the Deduplicate Snap to add a new field, Group ID, which includes the Group ID for each record in the output. Also, enhances the Comparator field in the Snap by adding one more option, Exact, which identifies and classifies a match as either an exact match or not a match at all. Enhances the Sample Snap by adding a second output view which displays data that is not in the first output. Also, a new algorithm type, Linear Split, which enables you to split the dataset based on the pass-through percentage.
4.20 Patch	mldatapreparation8771	18 Mar 2020	Latest	Removes the unused `jcc-optional` dependency from the ML Data Preparation Snap Pack.
4.20	snapsmrc535	08 Feb 2020	Stable	Upgraded with the latest SnapLogic Platform release.
4.19	snapsmrc528	14 Nov 2019	Stable	New Snap: Introducing the Deduplicate Snap. Use this Snap to remove duplicate records from input documents. When you use multiple matching criteria to deduplicate your data, it is evaluated using each criterion separately, and then aggregated to give the final result.
4.18	snapsmrc523	10 Aug 2019	Stable	Upgraded with the latest SnapLogic Platform release.
4.17 Patch	ALL7402	11 Jun 2019	Latest	Pushed automatic rebuild of the latest version of each Snap Pack to SnapLogic UAT and Elastic servers.
4.17	snapsmrc515	11 Jun 2019	Latest	New Snap: Introducing the Feature Synthesis Snap, which automatically creates features out of multiple datasets that share a one-to-one or one-to-many relationship with each other. New Snap: Introducing the Match Snap, which enables you to automatically identify matched records across datasets that do not have a common key field. Added the Snap Execution field to all Standard-mode Snaps. In some Snaps, this field replaces the existing Execute during preview check box.
4.16	snapsmrc508	16 Feb 2019	Stable	Added a new Snap, Principal Component Analysis, which enables you to perform principal component analysis (PCA) on numeric fields (columns) to reduce dimensions of the dataset.
4.15	snapsmrc500	15 Dec 2018	Stable	New Snap Pack. Perform preparatory operations on datasets such as data type transformation, data cleanup, sampling, shuffling, and scaling. Snaps in this Snap Pack are: Categorical to Numeric, Clean Missing Values, Date Time Extractor, Numeric to Categorical, Sample, Scale, Shuffle, and Type Converter.
