Introduction to Binning

The Binning stage performs unsupervised binning: it divides a continuous variable into groups (bins) without taking into account objective (target) information. For each bin, the data captured includes its range, the quantity of values it contains, and the percentage of all values that fall within that range.

Advantages of performing binning include the following:
  • It allows records with missing data to be included in the model.
  • It controls or mitigates the impact of outliers on the model.
  • It solves the issue of having different scales among the characteristics, making the weights of the coefficients in the final model comparable.
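
The first two advantages can be illustrated with a minimal Python sketch. It is not the Binning stage's implementation: the function name bin_label, the bin edges, the sample values, and the choice to place missing values in a dedicated "missing" bin are all assumptions made for illustration.

```python
import math

def bin_label(value, edges):
    """Return a bin label for value (hypothetical helper, illustration only)."""
    # Assumption: missing values (None or NaN) get their own bin so the
    # record can stay in the data set instead of being dropped.
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return "missing"
    # Walk the upper edges; a value below the first inner edge lands in bin_0.
    for i in range(len(edges) - 1):
        if value < edges[i + 1]:
            return f"bin_{i}"
    # Anything at or above the last edge, including extreme outliers,
    # is absorbed by the last bin.
    return f"bin_{len(edges) - 2}"

# Made-up edges and values; 1_000_000 plays the role of an outlier.
edges = [0, 30, 60, 90]
incomes = [25, None, 45, 1_000_000, 70]
labels = [bin_label(v, edges) for v in incomes]
print(labels)
```

In this sketch the record with the missing value is kept, and the outlier receives the same label as 70, so its magnitude no longer dominates. Because every characteristic is reduced to bin labels, characteristics measured on different scales become comparable.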

In Spectrum™ Technology Platform unsupervised binning, you can use equal-width bins, where the data is divided into bins of equal size, or equal-frequency bins, where the data is divided into groups containing approximately the same number of records. In the Binning stage, equal-width bins are referred to as Equal Range bins and equal-frequency bins are referred to as Equal Population bins.
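
The difference between the two methods is easiest to see on a small, skewed sample. The following Python sketch is only an approximation of the idea and does not reflect how the Binning stage computes its bins; the function names, the sample values, and the simple tie handling (identical values may be split across equal-frequency bins) are assumptions.

```python
def equal_width_bins(values, n_bins):
    """Equal Range style: split [min, max] into n_bins intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    bins = [[] for _ in range(n_bins)]
    for v in values:
        # The maximum value would index one past the end, so clamp it.
        idx = min(int((v - lo) / width), n_bins - 1) if width > 0 else 0
        bins[idx].append(v)
    edges = [lo + i * width for i in range(n_bins)] + [hi]
    return edges, bins

def equal_frequency_bins(values, n_bins):
    """Equal Population style: split the sorted records into near-equal chunks."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[i * n // n_bins:(i + 1) * n // n_bins] for i in range(n_bins)]

def summarize(bins):
    """Return (count, percentage of all values) for each bin."""
    total = sum(len(b) for b in bins)
    return [(len(b), 100.0 * len(b) / total) for b in bins]

# Made-up sample with one large value to show the contrast.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
ew_edges, ew_bins = equal_width_bins(values, 5)
ef_bins = equal_frequency_bins(values, 5)
ew_summary = summarize(ew_bins)
ef_summary = summarize(ef_bins)
print(ew_summary)
print(ef_summary)
```

With this sample, the equal-width bins each span the same range but hold very different numbers of records, while the equal-frequency bins each hold two records but span very different ranges.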

You can view a list of binnings and delete binnings using command-line instructions. See "Machine Learning Module" in the Administration Utility section of the Administration Guide.