SQL Command

SQL Command executes one or more SQL commands for each record in the data flow. You can use SQL Command to:

  • Execute complex INSERT/UPDATE statements, such as statements that have subqueries/joins with other tables.
  • Update tables after inserting/updating data to maintain referential integrity.
  • Update or delete a record in a database before a replacement record is loaded.
  • Update multiple tables in a single transaction.

You can execute additional SQL commands before and after executing the main SQL commands, and you can invoke stored procedures.

Note: To execute a stored procedure, use this syntax:

Call <Procedure Name>

Stored procedures invoked from SQL Command must not use OUT parameters.

Note: Significant performance improvements can be achieved by using multiple runtime instances of SQL Command. To specify multiple runtime instances, click the Runtime button.

General

The General tab is where you specify dynamic SQL statements that you want to execute once for each record. The following table lists the options available on the General tab.

Option Description

Connection

Select the database connection you want to use. Your choices vary depending on what connections are defined in the Connection Manager of Spectrum Management Console. If you need to make a new database connection, or modify or delete an existing database connection, click Manage Connections.

If you are adding or modifying a database connection, complete these fields:
Connection name
Enter a name for the connection. The name can be anything you choose.
Database driver
Select the appropriate database type.
Connection options
Specify the host, port, instance, user name, and password to use to connect to the database.

SQL statements

Enter the SQL statements you want to run for each record in the dataflow. As you begin to type, an auto-complete pop-up window will display the valid SQL commands. Separate multiple SQL statements with a semicolon (;).

To specify a value from a dataflow field, use this syntax:

${<field name>}

Where <field name> is the name of a field in the data flow.

For example,

UPDATE MyDatabase.dbo.customer
SET name=${Name}
WHERE id=${ID};

In this example ${Name} will be replaced with the value from the dataflow's Name field and ${ID} will be replaced with the value from the dataflow's ID field.

Note: Queries must use the fully-qualified name. For example, MyDatabase.dbo.customer.

Transaction processing

Specifies whether to process records in batches or to process all records at the same time. One of these:

Batch size
Groups records into batches of the size you specify and processes one batch at a time.
Entire Run
Creates one large batch for all records and processes all transactions at the same time.

Error processing

Specifies what to do if an error is encountered while executing the SQL commands. One of the following:

Do not stop the data flow on error
The data flow continues to run if the database returns an error while executing the SQL commands.
Stop the data flow after encountering this many errors
The data flow will stop running after the database returns the specified number of errors.
Note: If there is a syntax error in the SQL, the data flow will always stop regardless of which setting you choose here.

In addition, you can optionally write error records to a sink by connecting the SQL Command error port to the type of sink you want. The error port is the white triangle on the right side of the stage icon in the data flow. For example, to write error records to a flat file, you would connect the SQL Command error port to a Write to File stage, as shown here:

Pre/Post SQL

The Pre/Post SQL tab is where you specify SQL statements that you want to execute once per data flow run, as opposed to once per record as is the case with the SQL you specify on the General tab. The following table lists the options available on the Pre/Post SQL tab.

Option Description

Pre-SQL

Type one or more SQL statements that you want to execute before the records coming into the stage are processed. The SQL statements you enter here are executed once per run after the data flow starts running but before the SQL Command stage processes the first records.

An example use of pre-SQL would be to create a table for the records that will be processed.

Autocommit pre-SQL

Check this box to commit the pre-SQL statements before executing the SQL statements on the General tab.

If you do not check this box, the pre-SQL statements will be committed in the same transaction as the SQL statements on the General tab.

Note: If you check neither the Autocommit pre-SQL nor the Autocommit post-SQL boxes, then all SQL statements for the stage are committed in one transaction.
Post-SQL

Type one or more SQL statements that you want to execute after all the records are processed. The SQL statements you enter here are executed once per run after the SQL Command stage is finished but before the data flow finishes.

An example use of pre-SQL would be to build an index after processing the records.

Autocommit post-SQL

Check this box to commit the post-SQL statements in their own transaction after the SQL commands on the General tab are committed.

If you do not check this box, the post-SQL statements will be committed in the same transaction as the SQL statements on the General tab.

Note: If you check neither the Autocommit pre-SQL nor the Autocommit post-SQL boxes, then all SQL statements for the stage are committed in one transaction.

Runtime Tab

The Runtime tab displays Stage Options and gives you the flexibility of defining default values for the stage options.

Field Name

Description

Stage Options
This section lists the dataflow options used in the SQL query of this stage and allows you to provide a default value for all these options. The Name column lists the options while you can enter the default values in the corresponding Value column.
Note: The default value provided here is also displayed in the Map dataflow options to stages section of the Dataflow Options dialog box. The dialogue box also allows you to change the default value. In case of a clash of default values provided for an option through Stage Options, Dataflow Options, and Job Executor the order of precedence is: Value provided through Job Executor > Value defined through the Dataflow Options dialogue box > Value entered through the Stage Options.