Supervised Learning: Loan Default Prediction

Download the supervised learning demonstration

The Data Science supervised learning demonstration predicts loan defaults using Lending Club data. It uses several files that together demonstrate the functionality of the Spectrum Technology Platform Data Science Solution in Enterprise Designer.

Spectrum_DataScience_Supervised_Learning.zip includes the following files:

  • Spectrum_DataScience_Supervised_Learning.pdf—Documentation that walks you through how to build and use the single categorizer dataflow, the scoring dataflow, and all supporting files.
  • Data.zip—The required input files, test files, and training files for each of the included dataflows.
    • loan.csv
    • LoanStats_2016Q1.csv
    • LoanStats_2016Q2.csv
    • LoanStats_2016Q3.csv
    • testData.txt
    • testDataCollege.txt
    • testDataStable.txt
    • testDataThankful.txt
    • trainData.txt
    • trainDataCollege.txt
    • trainDataStable.txt
    • trainDataThankful.txt
    • training.xml
    • trainingCollege.xml
    • trainingStable.xml
    • trainingThanks.xml
  • Lending_Club_Demo_DF_(V12.1).zip—The dataflows for Spectrum Technology Platform 12.1.
    • LendingClub_2007_2016Q12_v121_MultipleCategorizers.df
    • LendingClub_2007_2016Q1Q2_v121_SingleCategorizer.df
    • LendingClub_2016Q3_v121_SingleCategorizer_Scoring.df
  • Lending_Club_Demo_DF_(V12.2).zip—The dataflows for Spectrum Technology Platform 12.2.
    • LendingClub_2007_2016Q12_v122_MultipleCategorizers.df
    • LendingClub_2007_2016Q1Q2_v122_SingleCategorizer.df
    • LendingClub_2016Q3_v122_SingleCategorizer_Scoring.df
  • ReadMe.txt—High-level descriptions and instructions for the previously mentioned files.

You can create your own dataflow by following the step-by-step instructions in the documentation, or you can use the included dataflows as references to confirm how each completed stage, and each dataflow as a whole, should look.
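
Before opening the dataflows, it can help to confirm that the download is complete. The following Python sketch checks the archive for the five top-level items listed above. It is not part of the demonstration package. The function name `missing_entries` is hypothetical, and the sketch assumes only that the listed items appear somewhere in the archive, possibly inside a wrapping folder.

```python
import zipfile
from pathlib import PurePosixPath

# Top-level items named in the file list above.
EXPECTED_TOP_LEVEL = {
    "Spectrum_DataScience_Supervised_Learning.pdf",
    "Data.zip",
    "Lending_Club_Demo_DF_(V12.1).zip",
    "Lending_Club_Demo_DF_(V12.2).zip",
    "ReadMe.txt",
}


def missing_entries(archive, expected=EXPECTED_TOP_LEVEL):
    """Return a sorted list of expected file names absent from the archive.

    `archive` is a path or a binary file-like object (hypothetical helper,
    not a Spectrum Technology Platform API).
    """
    with zipfile.ZipFile(archive) as zf:
        # Compare base names only, so a wrapping folder inside the
        # archive (an assumption about its layout) does not matter.
        present = {PurePosixPath(name).name for name in zf.namelist()}
    return sorted(set(expected) - present)
```

Calling `missing_entries("Spectrum_DataScience_Supervised_Learning.zip")` returns an empty list when all five items are present. The same check could be repeated on Data.zip and the two dataflow archives by passing their own lists of expected names.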