ML Data Preparation Snap Pack

ML Data Preparation Snap Pack is a part of SnapLogic Data Science (Machine Learning) Snaps. The Snaps in this Snap Pack are useful in preparing the data upon which machine learning operations are to be performed. Use Snaps in this Snap Pack to:

  • Convert categorical data to numeric and vise versa.
  • Extract datetime components. 
  • Scale/transform data.
  • Convert datatypes.
  • Generate samples from a dataset.
  • Randomly shuffle the order of documents in a dataset.
  • Handle missing values in a dataset.
  • Perform Principal Component Analysis (PCA) on an input document.
  • Create features out of multiple datasets that share a one-to-one or one-to-many relationship with each other.
  • Identify matched records across datasets.
  • Mask sensitive information in your dataset before exporting the dataset for analytics.

Snap Pack History

 Click to view/expand

4.27 (main12833)

  • No updates made.

4.26 (main11181)

  • No updates made.

4.25 (425patches10994)

  • Fixed an issue when the Deduplicate Snap where the Snap breaks when running on a locale that does not format decimals with Period (.) character. 

4.25 (main9554)

  • No updates made.

4.24 (main8556)

  • No updates made.

4.23 (main7430)

  • No updates made.

4.22 (main6403)

  • No updates made.

4.21 (snapsmrc542)

  • Introduces the Mask Snap that enables you to hide sensitive information in your dataset before exporting the dataset for analytics or writing the dataset to a target file.
  • Enhances the Match Snap to add a new field, Match all, which matches one record from the first input with multiple records in the second input. Also, enhances the Comparator field in the Snap by adding one more option, Exact, which identifies and classifies a match as either an exact match or not a match at all.
  • Enhances the Deduplicate Snap to add a new field, Group ID, which includes the Group ID for each record in the output. Also, enhances the Comparator field in the Snap by adding one more option, Exact, which identifies and classifies a match as either an exact match or not a match at all.
  • Enhances the Sample Snap by adding a second output view which displays data that is not in the first output. Also, a new algorithm type, Linear Split, which enables you to split the dataset based on the pass-through percentage.

4.20 Patch mldatapreparation8771

  • Removes the unused jcc-optional dependency from the ML Data Preparation Snap Pack.

4.20 (snapsmrc535)

  • No updates made.

4.19 (snapsmrc528)

  • New Snap: Introducing the Deduplicate Snap. Use this Snap to remove duplicate records from input documents. When you use multiple matching criteria to deduplicate your data, it is evaluated using each criterion separately, and then aggregated to give the final result.

4.18 (snapsmrc523)

  • No updates made.

4.17 Patch ALL7402

  • Pushed automatic rebuild of the latest version of each Snap Pack to SnapLogic UAT and Elastic servers.

4.17 (snapsmrc515)

  • New Snap: Introducing the Feature Synthesis Snap, which automatically creates features out of multiple datasets that share a one-to-one or one-to-many relationship with each other.
  • New Snap: Introducing the Match Snap, which enables you to automatically identify matched records across datasets that do not have a common key field.
  • Added the Snap Execution field to all Standard-mode Snaps. In some Snaps, this field replaces the existing Execute during preview check box.

4.16 (snapsmrc508)

  • Added a new Snap, Principal Component Analysis, which enables you to perform principal component analysis (PCA) on numeric fields (columns) to reduce dimensions of the dataset.

4.15 (snapsmrc500)

  • New Snap Pack. Perform preparatory operations on datasets such as data type transformation, data cleanup, sampling, shuffling, and scaling. Snaps in this Snap Pack are: 
    • Categorical to Numeric
    • Clean Missing Values
    • Date Time Extractor
    • Numeric to Categorical
    • Sample
    • Scale
    • Shuffle
    • Type Converter