...

Label

Required. The name for the Snap. You can modify this to be specific, especially if you have more than one of the same Snap in your Pipeline.
Algorithm

Required. The clustering algorithm that the Snap uses to group the input data into clusters. The available options are:

  • K-Means: Partitions n observations into k clusters in which each observation belongs to the cluster with the nearest mean.
  • X-Means: An extension of K-Means that tries to determine the number of clusters automatically, based on Bayesian Information Criterion (BIC) scores.
  • G-Means: Another extension of K-Means that tries to determine the number of clusters automatically, by using a normality test.
Info

For a detailed description of the algorithms, read here.

Default value: K-Means
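
The K-Means description above can be illustrated with a minimal sketch in plain Python. This is not the Snap's implementation. The function name kmeans, its parameters, and the random choice of starting means are assumptions made for illustration only.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Illustrative K-Means (Lloyd's algorithm) over equal-length numeric tuples.

    Hypothetical helper: not the Snap's code. Returns (assignment, centroids).
    """
    rng = random.Random(seed)
    # Assumption: start from k input points chosen at random.
    centroids = rng.sample(points, k)
    assignment = None
    for _ in range(iterations):
        # Assignment step: each observation joins the cluster with the nearest mean.
        new_assignment = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Stop once no observation changes cluster.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Update step: move each mean to the average of its current members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous mean
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignment, centroids


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, means = kmeans(points, k=2)
```

With two well-separated groups such as the sample points, the three points near the origin share one cluster index and the three points near (10, 10) share the other. X-Means and G-Means build on the same two steps but also decide how many clusters to keep.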

Max cluster

Required. The maximum number of clusters that the Snap must create. 

Default value: 3

Minimum: 2

Maximum: 10000

Note

If you select K-Means as the Algorithm, the Snap creates exactly the number of clusters that you specify here. For the X-Means and G-Means algorithms, the Snap automatically optimizes the number of clusters for your dataset, so the result might contain as many clusters as you specify here, or fewer.
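
The rule in this note, together with the documented minimum and maximum, can be written as a small check. This is a sketch only: the function names are hypothetical, and the lower bound of one cluster for X-Means and G-Means is an assumption, since this page states only the upper bound.

```python
# Limits taken from the Max cluster field description.
MIN_MAX_CLUSTER = 2
MAX_MAX_CLUSTER = 10000


def validate_max_cluster(value):
    """Return value if it is an integer in the documented range, else raise."""
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("Max cluster must be an integer")
    if not MIN_MAX_CLUSTER <= value <= MAX_MAX_CLUSTER:
        raise ValueError(
            f"Max cluster must be between {MIN_MAX_CLUSTER} and {MAX_MAX_CLUSTER}"
        )
    return value


def cluster_count_is_expected(algorithm, max_cluster, produced):
    """Check a produced cluster count against the rule described in the note."""
    validate_max_cluster(max_cluster)
    if algorithm == "K-Means":
        # K-Means creates exactly the requested number of clusters.
        return produced == max_cluster
    if algorithm in ("X-Means", "G-Means"):
        # Assumption: at least one cluster; the page gives only the upper bound.
        return 1 <= produced <= max_cluster
    raise ValueError(f"Unknown algorithm: {algorithm}")
```

For example, with Max cluster set to 3, a K-Means result must contain three clusters, while an X-Means or G-Means result with two clusters is also consistent with the setting.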


Pass through

Select this checkbox to include all input fields in the output. Otherwise, the Snap outputs only the cluster index.

Default value: Selected
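
The effect of this setting on a single output document can be sketched as follows. The function build_output and the output field name "cluster" are hypothetical. This page does not state the name of the field that holds the cluster index.

```python
def build_output(input_doc, cluster_index, pass_through=True, index_field="cluster"):
    """Illustrate Pass through: keep the input fields or emit only the cluster index.

    index_field is a placeholder name, not the Snap's documented field name.
    """
    # With Pass through selected, start from a copy of the input fields.
    output = dict(input_doc) if pass_through else {}
    output[index_field] = cluster_index
    return output


doc = {"height": 1.8, "weight": 75}
with_fields = build_output(doc, 2)                      # input fields plus cluster index
index_only = build_output(doc, 2, pass_through=False)   # cluster index only
```

In this sketch, with_fields contains height, weight, and the cluster index, while index_only contains the cluster index alone.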

Snap Execution

The Snap execution mode. The available options are:

  • Validate & Execute: Executes the Pipeline during execution and validation.
  • Execute only: Executes the Pipeline during execution only, and not during validation.
  • Disabled: Does not execute the Pipeline during execution or validation.

Default value: Validate & Execute

...