Adding a Time Stamp to Records in a Data Warehouse

A convenient way to assure data quality is to flag records in the data warehouse according to the date they were loaded. If the load process does not complete or you notice problems after data is already loaded, a time stamp column makes it easy to identify which records were affected. You can then delete all the records processed during a particular phase, return to the pre-load state and address any problems before attempting to load the data again. You can time stamp the load operation by adding an extra column, such as load_date, to your fact table using the SQL Command stage.

To have the dataflow add a timestamp when populating or updating a data warehouse:

  1. In Enterprise Designer, open the dataflow that populates or updates the data warehouse:
  2. Drag a Transformer stage to the canvas and connect it to the dataflow just before the Write to DB stage.

    For example:



  3. Double-click the Transformer stage.
  4. Click Add.
  5. Under General, select Custom.
  6. In the Custom transform name field, enter a name for this transform. The name can be anything you want. For example, Add Time Stamp.
  7. In the Custom script field, enter this:
    data['<timestamp field>']=currentDateTime()

    Where <timestamp field> is the name of the dataflow field that you want to contain the time stamp.

    For example, if you want to put the time stamp in a dataflow field named Timestamp then your custom script would be:

    data['Timestamp']=currentDateTime()
  8. Click the Add button at the bottom of the window.
  9. Click Close.
  10. Click OK to close the Transformer Options window.

The dataflow now adds the current time to a field in each record, providing a time stamp in the data warehouse that shows when each record was loaded.