The ML Data Preparation Snap Pack is part of the SnapLogic Data Science (Machine Learning) Snaps. The Snaps in this Snap Pack prepare data for machine learning operations. Use Snaps in this Snap Pack to:

  • Convert categorical data to numeric and vice versa.
  • Extract datetime components. 
  • Scale/transform data.
  • Convert datatypes.
  • Generate samples from a dataset.
  • Randomly shuffle the order of documents in a dataset.
  • Handle missing values in a dataset.
  • Perform Principal Component Analysis (PCA) on an input document.
  • Create features out of multiple datasets that share a one-to-one or one-to-many relationship with each other.
  • Identify matched records across datasets.
  • Mask sensitive information in your dataset before exporting the dataset for analytics.
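
To make a few of the operations above concrete, here is a minimal Python sketch of three of them: encoding categories as numbers, filling missing values, and scaling. It is not the Snaps' implementation. The function names, the mean-fill strategy, and the min-max scaling method are assumptions chosen for illustration only.

```python
from statistics import mean

def categorical_to_numeric(values):
    """Map each distinct category to an integer code, in order of first appearance."""
    codes = {}
    encoded = []
    for value in values:
        if value not in codes:
            codes[value] = len(codes)
        encoded.append(codes[value])
    return encoded, codes

def fill_missing_with_mean(values):
    """Replace None entries with the mean of the values that are present (assumed strategy)."""
    present = [v for v in values if v is not None]
    average = mean(present)
    return [average if v is None else v for v in values]

def min_max_scale(values):
    """Rescale values linearly into the range [0, 1] (assumed scaling method)."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]

# Hypothetical input documents.
rows = [
    {"color": "red", "size": 10.0},
    {"color": "blue", "size": None},
    {"color": "red", "size": 30.0},
]

colors, mapping = categorical_to_numeric([r["color"] for r in rows])
sizes = min_max_scale(fill_missing_with_mean([r["size"] for r in rows]))
```

In this sketch the missing size is filled before scaling, so that the scaling step sees only numeric values.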


Snap Pack History

4.21 (snapsmrc542)

  • Introduces the Mask Snap, which enables you to hide sensitive information in your dataset before exporting the dataset for analytics or writing the dataset to a target file.

4.20 Patch mldatapreparation8771

  • Removes the unused jcc-optional dependency from the ML Data Preparation Snap Pack.

4.20 (snapsmrc535)

  • No updates made.

4.19 (snapsmrc528)

  • New Snap: Introducing the Deduplicate Snap. Use this Snap to remove duplicate records from input documents. When you use multiple matching criteria to deduplicate your data, the data is evaluated against each criterion separately, and the per-criterion results are then aggregated to give the final result.
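
The idea of evaluating each matching criterion separately and then aggregating can be sketched in Python as follows. This is an illustration under assumptions, not the Deduplicate Snap's algorithm: the string-similarity measure (`difflib.SequenceMatcher`), the averaging of per-field scores, the 0.85 threshold, and all function names are hypothetical choices.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Case-insensitive similarity of two strings, between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def is_duplicate(rec_a, rec_b, fields, threshold=0.85):
    """Score each field (criterion) separately, then aggregate by averaging (assumed)."""
    scores = [similarity(rec_a[f], rec_b[f]) for f in fields]
    return sum(scores) / len(scores) >= threshold

def deduplicate(records, fields, threshold=0.85):
    """Keep the first record of each group of near-duplicates."""
    kept = []
    for record in records:
        if not any(is_duplicate(record, k, fields, threshold) for k in kept):
            kept.append(record)
    return kept

# Hypothetical input documents.
records = [
    {"name": "Jane Smith", "city": "Boston"},
    {"name": "Jane Smyth", "city": "Boston"},
    {"name": "Raj Patel", "city": "Austin"},
]

unique = deduplicate(records, ["name", "city"])
```

Here the two "Jane" records differ by one letter in the name and match exactly on city, so their averaged score clears the threshold and only the first is kept.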

4.18 (snapsmrc523)

  • No updates made.

4.17 Patch ALL7402

  • Pushed automatic rebuild of the latest version of each Snap Pack to SnapLogic UAT and Elastic servers.

4.17 (snapsmrc515)

  • New Snap: Introducing the Feature Synthesis Snap, which automatically creates features out of multiple datasets that share a one-to-one or one-to-many relationship with each other.
  • New Snap: Introducing the Match Snap, which enables you to automatically identify matched records across datasets that do not have a common key field.
  • Added the Snap Execution field to all Standard-mode Snaps. In some Snaps, this field replaces the existing Execute during preview check box.

4.16 (snapsmrc508)

  • Added a new Snap, Principal Component Analysis, which enables you to perform principal component analysis (PCA) on numeric fields (columns) to reduce the dimensionality of the dataset.
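
As a rough illustration of what PCA does to numeric columns, the following Python sketch finds the first principal component of a small dataset and projects each row onto it, reducing two columns to one. It is not the Snap's implementation. The function name and the use of power iteration on the covariance matrix are assumptions made to keep the example dependency-free.

```python
import math

def first_principal_component(rows, iterations=100):
    """Return the first principal component and the 1-D projection of each row."""
    n, d = len(rows), len(rows[0])
    # Center each column on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered columns.
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration: repeatedly multiply by the covariance matrix and normalize.
    # Assumes the all-ones start vector is not orthogonal to the component.
    vec = [1.0] * d
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0:
            break
        vec = [x / norm for x in nxt]
    # Project every centered row onto the component.
    scores = [sum(c[j] * vec[j] for j in range(d)) for c in centered]
    return vec, scores

# Hypothetical data: the second column is exactly twice the first.
points = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
component, scores = first_principal_component(points)
```

Because the two columns in this data are perfectly correlated, a single component captures all of the variance, and `scores` holds the one-dimensional representation of each row.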

4.15 (snapsmrc500)

  • New Snap Pack. Perform preparatory operations on datasets such as data type transformation, data cleanup, sampling, shuffling, and scaling. Snaps in this Snap Pack are: 
    • Categorical to Numeric
    • Clean Missing Values
    • Date Time Extractor
    • Numeric to Categorical
    • Sample
    • Scale
    • Shuffle
    • Type Converter