Sorting Records with Sorter

The Sorter stage allows you to sort records using the fields you specify.

  1. Under Control Stages, drag Sorter to the canvas, placing it where you want on the dataflow.
  2. Double-click Sorter.
  3. Click Add.
  4. Click the down-arrow in the Field Name column and select the field that you want to sort on.
    Note: The list of available fields is based on the fields used in the previous stages in the dataflow.
  5. In the Order column, choose whether you want to sort in ascending or descending order.
  6. In the Type column, select the field's data type.
    Note: If your incoming data is not in string format, the Type column will be disabled.
    bigdecimal
    A numeric data type that supports 38 decimal points of precision. Use this data type for data that will be used in mathematical calculations requiring a high degree of precision, especially those involving financial data. The bigdecimal data type supports more precise calculations than the double data type.
    double
    A numeric data type that contains both negative and positive double precision numbers between 2-1074 and (2-2-52)×21023. In E notation, the range of values is -1.79769313486232E+308 to 1.79769313486232E+308.
    float
    A numeric data type that contains both negative and positive single precision numbers between 2-149 and (2-223)×2127. In E notation, the range of values -3.402823E+38 to 3.402823E+38.
    integer
    A numeric data type that contains both negative and positive whole numbers between -231 (-2,147,483,648) and 231-1 (2,147,483,647).
    long
    A numeric data type that contains both negative and positive whole numbers between -263 (-9,223,372,036,854,775,808) and 263-1 (9,223,372,036,854,775,807).
    string
    A sequence of characters.
  7. To remove blank space from before and after the value before sorting, check the box in the Trim column. The trim option does not modify the value of the field. It only trims the value for the purpose of sorting. Note that if your incoming data is not in string format, the Trim column will be disabled.
  8. In the Treat Null As column, select Largest or Smallest to indicate the placement of null values in the sorted list. The placement depends on the combination of options selected in the Order and Treat Null As fields, as shown in the table below:
    Order Treat Null As Placement of null values in the sorted list
    Ascending Largest Bottom of the list
    Ascending Smallest Top of the list
    Descending Largest Top of the list
    Descending Smallest Bottom of the list
  9. Repeat until you have added all the fields you want to sort.
  10. Rearrange the sort order as desired by clicking Up or Down. This allows you to sort first by one field, then sort the resulting order again by another field.
  11. If you want to override the default sort performance options that have been defined by your administrator, click Advanced, check the Override sort performance options box, then specify the following options:
    In memory record limit
    Specifies the maximum number of data rows a sorter will hold in memory before it starts paging to disk. By default, a sort of 10,000 records or less will be done in memory and a sort of more than 10,000 records will be performed as a disk sort. The maximum limit is 100,000 records. Typically an in-memory sort is much faster than a disk sort, so this value should be set high enough so that most of the sorts will be in-memory sorts and only large sets will be written to disk.
    Note: Be careful in environments where there are jobs running concurrently because increasing the In memory record limit setting increases the likelihood of running out of memory.
    Maximum number of temporary files
    Specifies the maximum number of temporary files that may be used by a sort process. Using a larger number of temporary files can result in better performance. However, the optimal number is highly dependent on the configuration of the server running Spectrum™ Technology Platform. You should experiment with different settings, observing the effect on performance of using more or fewer temporary files. To calculate the approximate number of temporary files that may be needed, use this equation:
    (NumberOfRecords × 2) ÷ InMemoryRecordLimit = NumberOfTempFiles 
    Note that the maximum number of temporary files cannot be more than 1,000.
    Enable compression
    Specifies that temporary files are compressed when they are written to disk.
    Note: The optimal sort performance settings depends on your server's hardware configuration. Nevertheless, the following equation generally produces good sort performance:
    (InMemoryRecordLimit × MaxNumberOfTempFiles ÷ 2) >= TotalNumberOfRecords
  12. Click OK.
    Note: You can remove the sort criteria as desired by highlighting a row and clicking Remove.