Populating a Dimension Table

Dimension tables are part of a star schema and contain detailed information for the columns in the fact table. Dimension tables have attributes and a single part primary key that joins the dimension table to the fact table. The single part primary key allows you to quickly browse a single dimension table. Browsing a dimension table can help determine the best way to query the fact table.

The following procedure describes how to use Spectrum Technology Platform to perform the initial population of a dimension table in your data warehouse.

Note: Before beginning this procedure, you must have defined connections to external resources you want to use as the source for the dimension table if you are using a database, file server, or web service as the source for the data. You must also define a connection to the data warehouse in which you want to create a dimension table. If you have not defined the necessary connections, see Connections.
  1. In your data warehouse, create the table that you want to use as a dimension table.
  2. In Management Console, create connections to your data source and the data warehouse.
  3. In Enterprise Designer, select File > New > Dataflow > Job.
  4. Drag the source stage onto the canvas.
    • To use data from a database to populate the table, drag the Read from DB stage onto the canvas.
    • To use data from a flat file to populate the table, drag the Read from File stage onto the canvas.
    • To use data from a variable format file to populate the table, drag the Read from Variable Format File stage onto the canvas.
    • To use data from an XML file to populate the table, drag the Read from XML stage onto the canvas.
  5. Double-click the source stage you just placed on the canvas and configure it to point to the source of the data you want to populate to the dimension table.
  6. Drag a Unique ID Generator stage onto the canvas and connect the source stage to it. For example, if you are using Read from DB as the source stage, you would connect Read from DB to Unique ID Generator.
  7. Double-click the Unique ID Generator stage on the canvas and configure it to create a surrogate key.
    Note: Typically the key from the operational system is not used as the primary key for a dimension in the warehouse. This helps maintain historical consistency because a key value might change in the operational system.
  8. Drag a Write to DB stage onto the canvas and connect Unique ID Generator to it.
  9. Double-click the Write to DB stage on the canvas and configure it to point to the database and dimension table that you want to populate. For information on configuring Write to DB, see Write to DB.
  10. Select File > Save and save the dataflow.
  11. To run the dataflow now and populate the dimension table, selectRun > Run Current Flow.