Clustering
A PMML clustering model assigns each record to its best matching cluster, using the distance or similarity measure defined in the model. A cluster is a subset of similar data. Clustering, a form of unsupervised learning, divides a dataset into groups such that members of the same group are as similar to each other as possible and members of different groups are as dissimilar as possible.
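The matching step can be sketched in a few lines of Python. This is only an illustration, not the engine's implementation: the function names `euclidean` and `best_cluster`, the cluster centers, and the choice of Euclidean distance are all assumptions made for the example. It shows the one rule that matters: with a distance measure the smallest score wins, with a similarity measure the largest score wins.

```python
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_cluster(record, centers, measure=euclidean, kind="distance"):
    """Return (0-based index, score) of the best matching cluster.

    kind="distance"   -> the smallest score wins
    kind="similarity" -> the largest score wins
    """
    scores = [measure(record, center) for center in centers]
    pick = min if kind == "distance" else max
    best = pick(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Hypothetical cluster centers, not taken from any real model.
centers = [(0.0, 0.0), (10.0, 10.0)]
index, score = best_cluster((3.0, 4.0), centers)
print(index, score)
```

Here the record (3, 4) lies closer to the first center, so that cluster is selected.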
Model Element
<ClusteringModel functionName="clustering" ...
Unsupported Features
Clustering models whose <MiningSchema> element references a <DerivedField> element are not supported.
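A model file could be pre-checked for this limitation before deployment. The sketch below uses Python's standard `xml.etree.ElementTree` and makes one assumption about what "reference" means: a `<MiningField>` inside a clustering model's `<MiningSchema>` whose `name` matches the `name` of any `<DerivedField>` in the document. The helper names and the stripped-down PMML fragment are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip an XML namespace prefix such as '{http://...}' from a tag."""
    return tag.rsplit("}", 1)[-1]


def derived_fields_in_mining_schema(pmml_text):
    """List MiningField names in clustering models that name a DerivedField."""
    root = ET.fromstring(pmml_text)
    derived = {el.get("name") for el in root.iter()
               if local(el.tag) == "DerivedField"}
    hits = []
    for model in root.iter():
        if local(model.tag) != "ClusteringModel":
            continue
        for child in model:
            if local(child.tag) != "MiningSchema":
                continue
            for field in child:
                if (local(field.tag) == "MiningField"
                        and field.get("name") in derived):
                    hits.append(field.get("name"))
    return hits


# Hypothetical, incomplete PMML fragment used only to exercise the check.
EXAMPLE = """
<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <TransformationDictionary>
    <DerivedField name="x_scaled" optype="continuous" dataType="double">
      <FieldRef field="x"/>
    </DerivedField>
  </TransformationDictionary>
  <ClusteringModel functionName="clustering">
    <MiningSchema>
      <MiningField name="x_scaled"/>
      <MiningField name="y"/>
    </MiningSchema>
  </ClusteringModel>
</PMML>
"""

flagged = derived_fields_in_mining_schema(EXAMPLE)
print(flagged)
```

A non-empty result suggests the model falls under the unsupported case, under the matching rule assumed above.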
Model Outputs
Supported Model Output Features | Description |
---|---|
predictedValue | The best matching cluster based on the distance or similarity measure used for clustering. |
transformedValue | A value generated via a transformation expression applied to the predicted model output. |
decision | A categorized value generated by applying an expression to the predicted model output. |
predictedDisplayValue | The human-readable value used to represent the predicted value from the model. |
entityId | If present, the 1-based index (implicit identifier) of the winning/predicted cluster. |
affinity | The distance or similarity, as defined in the model, between the provided record and the predicted cluster. |
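To make the `entityId` and `affinity` rows concrete, here is a rough Python sketch of how the two values relate for a distance-based model. It is an illustration under assumptions, not the engine's code: the function name `score`, the cluster centers, and the use of Euclidean distance are invented for the example. The only points taken from the table are that `entityId` is a 1-based index and that `affinity` is the record's distance to the predicted cluster.

```python
from math import sqrt


def score(record, centers):
    """Return entityId and affinity for a distance-based toy model."""
    distances = [
        sqrt(sum((x - c) ** 2 for x, c in zip(record, center)))
        for center in centers
    ]
    # Position of the closest center in the list (0-based).
    winner = min(range(len(distances)), key=distances.__getitem__)
    return {
        "entityId": winner + 1,         # 1-based index of the winning cluster
        "affinity": distances[winner],  # distance to the winning cluster
    }


# Hypothetical cluster centers, not taken from any real model.
centers = [(0.0, 0.0), (10.0, 10.0)]
out = score((9.0, 8.0), centers)
print(out)
```

The record (9, 8) is nearest to the second center, so the sketch reports `entityId` 2 together with the distance to that center.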