Introduction

The Binning stage performs unsupervised binning, which divides a continuous variable into groups (bins) without taking the objective (target) variable into account. The data captured for each bin includes its range, the number of values it contains, and the percentage of all values that fall within it.

Advantages of binning include the following:

  • It allows records with missing data to be included in the model.
  • It controls or mitigates the impact of outliers on the model.
  • It solves the issue of having different scales among the characteristics, making the weights of the coefficients in the final model comparable.
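
As a rough illustration of these advantages, the following Python sketch assigns values to bins by index. It is a generic example, not the Spectrum Technology Platform implementation: the function name assign_bin, the MISSING_BIN label, and the bin edges are all hypothetical, and the way the Binning stage itself treats missing values or outliers is not specified here.

```python
import math
from typing import List, Optional

MISSING_BIN = -1  # hypothetical label for records that have no value


def assign_bin(value: Optional[float], edges: List[float]) -> int:
    """Map a value to a bin index; missing values get a bin of their own."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return MISSING_BIN
    n_bins = len(edges) - 1
    # Values beyond the outer edges land in the first or last bin, so an
    # extreme outlier counts no more than any other member of that bin.
    for i in range(n_bins - 1):
        if value < edges[i + 1]:
            return i
    return n_bins - 1


# Two characteristics on very different scales (assumed edges).
age_edges = [0, 30, 60, 100]
income_edges = [0, 20000, 50000, 200000]

age_bins = [assign_bin(v, age_edges) for v in [25, None, 45, 5000]]
income_bins = [assign_bin(v, income_edges) for v in [15000.0, 32000.0, None, 1e7]]

print(age_bins)     # [0, -1, 1, 2]
print(income_bins)  # [0, 1, -1, 2]
```

In this sketch the record with a missing age is kept rather than dropped, the implausible age of 5000 falls into the last bin, and both characteristics end up on the same small scale of bin indexes.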

In Spectrum Technology Platform unsupervised binning, you can use equal-width bins, where the range of values is divided into intervals of equal width, or equal-frequency bins, where the data is divided into groups containing approximately the same number of records. In the Binning stage, equal-width bins are referred to as Equal Range bins and equal-frequency bins are referred to as Equal Population bins.
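
The difference between the two approaches can be sketched in a few lines of Python. This is a minimal illustration of the general techniques, not the algorithm used by the Binning stage: the function names equal_range_edges, equal_population_edges, and summarize are hypothetical, and the sample data is made up.

```python
from typing import List, Tuple


def equal_range_edges(values: List[float], n_bins: int) -> List[float]:
    """Edges for equal-width bins: every bin spans the same value range."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins)] + [hi]


def equal_population_edges(values: List[float], n_bins: int) -> List[float]:
    """Edges for equal-frequency bins: cut the sorted data at even ranks.

    Heavily repeated values can produce duplicate edges; this sketch does
    not try to resolve that case.
    """
    ordered = sorted(values)
    n = len(ordered)
    inner = [ordered[(i * n) // n_bins] for i in range(1, n_bins)]
    return [ordered[0]] + inner + [ordered[-1]]


def summarize(values: List[float], edges: List[float]) -> List[Tuple[float, float, int, float]]:
    """Return (low, high, count, percent) per bin; the last bin includes its upper edge."""
    n_bins = len(edges) - 1
    counts = [0] * n_bins
    for v in values:
        idx = n_bins - 1
        for i in range(n_bins - 1):
            if v < edges[i + 1]:
                idx = i
                break
        counts[idx] += 1
    total = len(values)
    return [(edges[i], edges[i + 1], counts[i], 100.0 * counts[i] / total)
            for i in range(n_bins)]


values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]  # one large outlier

# Equal Range: the outlier stretches the range, so 9 of 10 values share one bin.
range_summary = summarize(values, equal_range_edges(values, 2))
# Equal Population: each bin receives 5 of the 10 values.
population_summary = summarize(values, equal_population_edges(values, 2))

for low, high, count, pct in range_summary + population_summary:
    print(f"{low} to {high}: {count} values ({pct:.0f}%)")
```

With skewed data such as this, equal-width bins can leave most records in a single bin, while equal-frequency bins keep the bin counts balanced at the cost of uneven bin widths.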

You can perform additional binning functions using the Machine Learning Model Management Binning Management tool.

You can also list binnings and delete a binning using command line instructions. See "Binning" in the "Administration Utility" section of the Administration Guide.