Defining Model Properties

  1. Under Primary Stages > Deployed Stages > Machine Learning, click the PCA Options stage and drag it onto the canvas, place it where you want it in the dataflow, and connect it to other stages.
    Note: The input stage must be the data source that contains the principal components for your model. An output stage is not required, but you can connect one if you want to capture your output independently of the Machine Learning Model Management tool.
  2. Double-click the PCA Options stage to show the PCA Options dialog box.
  3. Enter a Model name if you do not want to use the default name.
  4. Optional: Check the Overwrite box to overwrite the existing model with new data.
  5. Enter the number of Principal components you want your model to contain.
  6. Optional: Enter a Description of the model.
  7. In the Inputs table, click Include for each field whose data you want added to the model.
  8. Use the Model Data Type drop-down to specify whether the input field is to be used as a categorical, datetime, numeric, string, or uniqueid field.
  9. Click OK to save the model and configuration, or continue to the next tab.
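
The settings collected in the steps above can be pictured as a small configuration record. The following Python sketch is illustrative only and is not part of the product or its API: the class names (PcaOptions, InputField), the field names, the sample values, and the validation rules are all assumptions made for this example. Only the list of model data types comes from step 8.

```python
from dataclasses import dataclass, field

# Model data types listed in step 8 of the procedure.
MODEL_DATA_TYPES = {"categorical", "datetime", "numeric", "string", "uniqueid"}


@dataclass
class InputField:
    """One row of the Inputs table (hypothetical representation)."""
    name: str
    include: bool = False
    model_data_type: str = "numeric"


@dataclass
class PcaOptions:
    """Hypothetical container mirroring the settings in the dialog box."""
    model_name: str
    principal_components: int
    overwrite: bool = False
    description: str = ""
    inputs: list = field(default_factory=list)

    def validate(self):
        """Return a list of problems; an empty list means no problems were found."""
        problems = []
        if not self.model_name.strip():
            problems.append("model name is empty")
        if self.principal_components < 1:
            problems.append("number of principal components must be at least 1")
        included = [f for f in self.inputs if f.include]
        if not included:
            problems.append("no input fields are included")
        for f in included:
            if f.model_data_type not in MODEL_DATA_TYPES:
                problems.append(
                    f"{f.name}: unknown model data type {f.model_data_type!r}"
                )
        return problems


# Example values are made up for illustration.
options = PcaOptions(
    model_name="CustomerPCA",
    principal_components=2,
    description="Example settings only",
    inputs=[
        InputField("customer_id", include=True, model_data_type="uniqueid"),
        InputField("age", include=True, model_data_type="numeric"),
        InputField("region", include=True, model_data_type="categorical"),
        InputField("notes", include=False, model_data_type="string"),
    ],
)
problems = options.validate()
print(problems)
```

The checks in validate() are assumed sanity checks for this sketch, not rules documented for the PCA Options stage. The stage may enforce different or additional constraints.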