Enterprise Data Integration
These stages are available to you if you have licensed the Enterprise Data Integration module. Use these stages to connect to data in multiple sources, either directly or through integration with your existing data access technologies.
Stage | Description |
---|---|
DB Loader | The DB Loader stage allows you to load data to and from the configured databases. This stage provides an interface to a high-speed data loading utility. Currently, the platform supports Oracle Loader, DB2 Loader, PostgreSQL Loader, and Teradata Loader. |
Field Selector | Use the Field Selector stage to choose the fields that are passed to the next stage in the dataflow and to remove unwanted ones. For example, if you have created a new field by combining the data from two fields and you no longer need the two source fields, you can use Field Selector to retain only the new field and remove the two source fields from the dataflow. |
Generate Time Dimension | Generate Time Dimension creates date records, one for each day of the date range you specify. You can then write these records to a time dimension table in a database using the Write to DB stage. The time dimension table can then be used to perform accurate calculations for time periods. |
Query DB | The Query DB stage allows you to use dataflow fields as parameters in a database query and return the results of the query as new fields in the dataflow. |
Read from DB | The Read from DB stage reads data from a database table or view as input to a dataflow. The stage is available for jobs, services, and subflows, but not for process flows. Note: The stage supports reading data from and writing data to HDFS 3.x and Hive 2.1.1. Also see Best Practices for connecting to HDFS 3.x and Hive 2.1.1. |
Read from Spreadsheet | The Read from Spreadsheet stage reads data from a spreadsheet, for example an Excel spreadsheet, as input to a dataflow. It supports two formats. |
SQL Command | SQL Command runs one or more SQL commands for each record in the dataflow. |
Write to Spreadsheet | The Write to Spreadsheet stage writes data to a spreadsheet, such as an Excel spreadsheet, as output from a dataflow. |
Write to DB | The Write to DB stage writes the output of a dataflow to a database. The stage writes all values of the date datatype as String values. This is the behavior of the jTDS driver, which is the default driver used by Spectrum. To handle date datatype values as is, use Microsoft's JDBC driver. Note: The stage supports reading data from and writing data to HDFS 3.x and Hive 2.1.1. Also see Best Practices for connecting to HDFS 3.x and Hive 2.1.1. |
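To illustrate what Generate Time Dimension produces, the sketch below builds one record per day of an inclusive date range in plain Python. The field names (`Date`, `Year`, `Month`, `DayOfWeek`) are illustrative assumptions, not the stage's actual output schema.

```python
from datetime import date, timedelta

def generate_time_dimension(start: date, end: date):
    """Yield one record per day of the inclusive date range, similar in
    spirit to the Generate Time Dimension stage. Field names are
    illustrative only, not the stage's real schema."""
    current = start
    while current <= end:
        yield {
            "Date": current.isoformat(),
            "Year": current.year,
            "Month": current.month,
            "DayOfWeek": current.isoweekday(),  # 1 = Monday ... 7 = Sunday
        }
        current += timedelta(days=1)

# One record per day for the first week of 2024.
records = list(generate_time_dimension(date(2024, 1, 1), date(2024, 1, 7)))
print(len(records))  # 7
```

In a dataflow, records like these would then be written to a time dimension table with the Write to DB stage.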
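The idea behind Query DB (a dataflow field supplies a query parameter, and the query result becomes a new field on the record) can be sketched with Python's `sqlite3` module. The table, columns, and field names here are invented for the example and are not part of the platform.

```python
import sqlite3

# In-memory database standing in for a configured database connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT)")
conn.execute("INSERT INTO customers VALUES (1, 'East'), (2, 'West')")

def enrich_record(record: dict) -> dict:
    """Use a field from the record as a query parameter and attach the
    query result as a new field, as the Query DB stage does."""
    row = conn.execute(
        "SELECT region FROM customers WHERE id = ?", (record["CustomerId"],)
    ).fetchone()
    record["Region"] = row[0] if row else None
    return record

print(enrich_record({"CustomerId": 2}))  # {'CustomerId': 2, 'Region': 'West'}
```

Each incoming record is looked up individually, mirroring the per-record enrichment the stage performs.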