Filtering Input Records

In the Read from Hadoop Sequence File stage, the Filter tab defines fields by which to filter the input records before they are sent into the dataflow. Filtering is optional.

  1. In Read from Hadoop Sequence File, click the Filter tab.
  2. In the Combine expression method field, choose All if every expression must evaluate to true for a record to be read into the dataflow; choose Any if a record should be read when one or more of the expressions is true.
  3. Click Add, then specify the field to test, the operator, and a value. The operators are listed in the table below.
    Operator                      Description

    Is Equal                      Checks if the value in the field matches the value specified.

    Is Not Equal                  Checks if the value in the field does not match the value specified.

    Is Greater Than               Checks if the field has a numeric value that is greater than the value specified. This operator works on numeric data types as well as string fields that contain numbers.

    Is Greater Than Or Equal To   Checks if the field has a numeric value that is greater than or equal to the value specified. This operator works on numeric data types as well as string fields that contain numbers.

    Is Less Than                  Checks if the field has a numeric value that is less than the value specified. This operator works on numeric data types as well as string fields that contain numbers.

    Is Less Than Or Equal To      Checks if the field has a numeric value that is less than or equal to the value specified. This operator works on numeric data types as well as string fields that contain numbers.

    Is Null                       Checks if the field is a null value.

    Is Not Null                   Checks if the field is not a null value.
  4. Select the Trim option if desired. This option trims any white space before and after the value of the field before the filter is applied.
  5. Repeat steps 3 and 4 until you have added all the input fields you wish to use for filtering.
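
The filter behavior described in the steps above can be illustrated with a short Python sketch. This is not the product's implementation or API. It only models the documented semantics under stated assumptions: a record is a dictionary, a null is represented by None, and the names Expression, evaluate, and filter_records are hypothetical. How the stage handles comparisons against non-numeric strings is not stated in this section, so the sketch simply treats them as not matching.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

# Hypothetical shape of one filter row: (field, operator, value, trim).
Expression = Tuple[str, str, Any, bool]


def _as_number(value: Any) -> Optional[float]:
    """Return value as a float, or None if it is null or not numeric."""
    if value is None or isinstance(value, bool):
        return None
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def _numeric(test: Callable[[float, float], bool]) -> Callable[[Any, Any], bool]:
    """Build a comparison that accepts numbers and strings containing numbers."""
    def compare(field_value: Any, target: Any) -> bool:
        left, right = _as_number(field_value), _as_number(target)
        # Assumption: a non-numeric operand never satisfies a numeric operator.
        return left is not None and right is not None and test(left, right)
    return compare


OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "Is Equal": lambda v, t: v == t,
    "Is Not Equal": lambda v, t: v != t,
    "Is Greater Than": _numeric(lambda a, b: a > b),
    "Is Greater Than Or Equal To": _numeric(lambda a, b: a >= b),
    "Is Less Than": _numeric(lambda a, b: a < b),
    "Is Less Than Or Equal To": _numeric(lambda a, b: a <= b),
    "Is Null": lambda v, t: v is None,
    "Is Not Null": lambda v, t: v is not None,
}


def evaluate(record: Dict[str, Any], expression: Expression) -> bool:
    """Evaluate one expression, trimming the field value first if requested."""
    field, operator, target, trim = expression
    value = record.get(field)
    if trim and isinstance(value, str):
        value = value.strip()  # remove leading and trailing white space
    return OPERATORS[operator](value, target)


def filter_records(records: List[Dict[str, Any]],
                   expressions: List[Expression],
                   combine: str = "All") -> List[Dict[str, Any]]:
    """Keep records that satisfy all ("All") or at least one ("Any") expression."""
    if not expressions:  # filtering is optional: no expressions, no filtering
        return list(records)
    combiner = all if combine == "All" else any
    return [r for r in records if combiner(evaluate(r, e) for e in expressions)]


records = [
    {"State": " NY ", "Amount": "150"},
    {"State": "CA", "Amount": "75"},
    {"State": "NY", "Amount": None},
]
expressions = [
    ("State", "Is Equal", "NY", True),           # Trim selected
    ("Amount", "Is Greater Than", 100, False),   # string field containing a number
]
print(len(filter_records(records, expressions, "All")))  # 1
print(len(filter_records(records, expressions, "Any")))  # 2
```

With All, only the first record passes, because its trimmed State matches and its Amount string holds a number greater than 100. With Any, the third record also passes on the State test alone. Without Trim, the first record's State value " NY " would not match "NY".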