Sorting Input Records

In the Read from File stage, the Sort Fields tab defines fields by which to sort the input records before they are sent into the dataflow. Sorting is optional.

  1. On the Sort Fields tab, click Add.
  2. Click the drop-down arrow in the Field Name column and select the field you want to sort by. The fields available for selection depend on the fields defined in this input file.
  3. In the Order column, select Ascending or Descending.
  4. Repeat until you have added all the input fields you wish to use for sorting. Change the order of the sort by highlighting the row for the field you wish to move and clicking Up or Down.
  5. Default sort performance options for your system are set in Spectrum Management Console. If you want to override your system's default sort performance options, click Advanced. The Advanced Options dialog box contains these sort performance options:
    In memory record limit
    Specifies the maximum number of data rows a sorter will hold in memory before it starts paging to disk. By default, a sort of 10,000 records or fewer is done in memory, and a sort of more than 10,000 records is performed as a disk sort. The maximum limit is 100,000 records. An in-memory sort is typically much faster than a disk sort, so set this value high enough that most sorts run in memory and only large sets are written to disk.
    Note: Be careful in environments where jobs run concurrently, because increasing the In memory record limit setting increases the likelihood of running out of memory.
    Maximum number of temporary files
    Specifies the maximum number of temporary files that may be used by a sort process. Using a larger number of temporary files can result in better performance. However, the optimal number is highly dependent on the configuration of the server running Spectrum Technology Platform. You should experiment with different settings, observing the effect on performance of using more or fewer temporary files. To calculate the approximate number of temporary files that may be needed, use this equation:
    (NumberOfRecords × 2) ÷ InMemoryRecordLimit = NumberOfTempFiles
    Note: The maximum number of temporary files cannot be more than 1,000.
    Enable compression
    Specifies that temporary files are compressed when they are written to disk.
    Note: The optimal sort performance settings depend on your server's hardware configuration. You can use this formula as a general guideline to produce good sort performance: (InMemoryRecordLimit × MaxNumberOfTempFiles ÷ 2) >= TotalNumberOfRecords
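
The two formulas above can be worked through with a short script before you change the values in the Advanced Options dialog box. The following is a minimal sketch in Python, not part of Spectrum Technology Platform: the function names estimate_temp_files and meets_guideline are hypothetical, and rounding the estimate up to a whole number of files is an assumption. The 1,000-file cap comes from the note on the maximum number of temporary files.

```python
# Hypothetical helpers for experimenting with sort performance settings.
# They only do the arithmetic from the formulas in this topic; they do not
# call any Spectrum Technology Platform API.

MAX_TEMP_FILES = 1_000  # documented upper bound on temporary files


def estimate_temp_files(number_of_records: int, in_memory_record_limit: int) -> int:
    """Approximate (NumberOfRecords x 2) / InMemoryRecordLimit.

    Assumption: the result is rounded up to a whole file and capped at the
    documented maximum of 1,000 temporary files.
    """
    if in_memory_record_limit <= 0:
        raise ValueError("in_memory_record_limit must be positive")
    # Integer ceiling division avoids floating-point rounding surprises.
    estimate = (number_of_records * 2 + in_memory_record_limit - 1) // in_memory_record_limit
    return min(estimate, MAX_TEMP_FILES)


def meets_guideline(in_memory_record_limit: int, max_temp_files: int, total_records: int) -> bool:
    """Check (InMemoryRecordLimit x MaxNumberOfTempFiles / 2) >= TotalNumberOfRecords."""
    return in_memory_record_limit * max_temp_files / 2 >= total_records


if __name__ == "__main__":
    records = 1_000_000
    limit = 10_000  # the default in-memory record limit
    files = estimate_temp_files(records, limit)
    print(files)                                   # 200
    print(meets_guideline(limit, files, records))  # True
    print(meets_guideline(limit, 100, records))    # False
```

For one million records at the default limit of 10,000, the estimate is 200 temporary files, which exactly satisfies the guideline. Halving the number of temporary files to 100 no longer does. Treat the result as a starting point and confirm it by measuring performance on your own server.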