Developing InfoLink Applications

Structure of InfoLink Applications

In InfoLink, the user develops applications to implement various data integration, data quality, and data management projects. An application consists of the following components:

  • Sources contain source definitions. Each source definition has a name, a type, and a list of source-type-specific attributes used to access data in the source. For example, the PgSQLSource type (a source type for connecting to a PostgreSQL database) has a connection string (connStr), user, and password as source-type-specific parameters. Creating a source only stores metadata about the source (namely its name, type, and source-type-specific parameters); it does not initiate any jobs to extract data from the source. Data is extracted from and loaded to sources by various operations that take source definitions (referred to by name) as their parameters.
  • Specifications define the rules of various data integration tasks. Each specification has a name, a type, and type-specific parameters. A specification name must be unique across all types. In InfoLink’s user interface, specifications are organized by type into sub-folders of the application navigation tree. For example, a Matching specification contains rules to match and consolidate duplicate records in a table. For each specification type there is usually a corresponding operation that executes specifications of that type. Such an operation takes the name of a specification as a parameter and executes the specification. It can also take other parameters to execute part of the specification or to execute it in a specific mode. For example, MatchOneSource is an operation that executes a Matching specification. Its additional parameters, which allow it to execute only the matching rules (without the consolidation rule), can be useful during development.
  • Scenarios define procedures that execute operations. More precisely, a scenario is a procedure consisting of a sequence of operation calls. Operations implement basic data integration tasks and are called from scenarios. For example, the LoadTableFromSource operation loads any table from any source into another source. Transform is a universal operation to implement various types of data transformation.
  • Scripts contain queries or scripts in a source-specific language, for example, an SQL query to a PostgreSQL database. Each script consists of a name, a source name, and script content in a language supported by the source. Scripts are passed as parameters (referred to by name) to some operations. For example, the RunSourceScript operation executes a script.
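
The relationships between these components can be sketched as a small in-memory model. This is only an illustration of the structure described above, not an InfoLink API: the class names, method names, and sample values (crm_db, match_customers, count_customers) are hypothetical, and the check that a script refers to an existing source is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    name: str
    type: str                  # e.g. "PgSQLSource"
    params: dict               # source-type-specific attributes


@dataclass
class Specification:
    name: str
    type: str                  # e.g. "Matching"
    params: dict = field(default_factory=dict)


@dataclass
class Script:
    name: str
    source_name: str           # refers to a source by name
    content: str               # text in a language the source supports


@dataclass
class Application:
    name: str
    sources: dict = field(default_factory=dict)
    specifications: dict = field(default_factory=dict)
    scripts: dict = field(default_factory=dict)

    def add_source(self, source: Source) -> None:
        # Only metadata is stored; nothing connects to the database here.
        self.sources[source.name] = source

    def add_specification(self, spec: Specification) -> None:
        # Names are unique across all specification types, so a single
        # dictionary keyed by name (not by type and name) is sufficient.
        if spec.name in self.specifications:
            raise ValueError(f"specification name already used: {spec.name}")
        self.specifications[spec.name] = spec

    def add_script(self, script: Script) -> None:
        # Assumption: a script must name a source that is already defined.
        if script.source_name not in self.sources:
            raise KeyError(f"unknown source: {script.source_name}")
        self.scripts[script.name] = script


app = Application("demo")
app.add_source(Source(
    "crm_db",
    "PgSQLSource",
    {"connStr": "host=localhost dbname=crm", "user": "etl", "password": "secret"},
))
app.add_specification(Specification("match_customers", "Matching"))
app.add_script(Script("count_customers", "crm_db", "SELECT count(*) FROM customers"))
```

Keying every collection by name mirrors the way operations refer to sources, specifications, and scripts by name rather than by holding the definitions themselves.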

Overview of InfoLink’s User Interface

Home Page

After you log in, you see the Home Page. The Home Page contains a list of applications. You can create a new application or delete an application. Click on an application name to see the application page.

Application Page

The Application Page allows you to manage all application components described in Structure of InfoLink Applications. The Application Page has the following UI components:

  • The current window is the central part of the Application Page. You can have only one window open at any time, which is the current window.
  • The application navigation tree resides on the left. Using the navigation tree, you can explore, view, and manage the application components listed in Structure of InfoLink Applications. To view or edit a component’s content, click on its name and it will open in the current window. To manage a component, right click on its name and select a management function from the menu.
  • The information bar resides at the top and contains the application name, ID, and status.

Saving Changes

When you edit an application, all changes are saved automatically in the background as you type or click; you do not have to click a save button. Do not worry about changing something by mistake, since all important changes (such as deleting an application or a specification) require your confirmation. When you manage components in the navigation tree, they are created or deleted as soon as you confirm the change. When you edit the content of the current window, the status shown on the information bar (at the top) indicates the state of saving (for example, No changes, Saving…, or All changes saved).

Managing Data Sources

Data sources are managed under the Sources section in the navigation tree.

Create a Data Source

  1. Right click on the Sources section in the navigation tree and select Create source.
  2. Enter the source name, select the source type, and click the Create button. The data source window will be opened.
  3. Enter source-specific parameters in the data source window.

Note: When you create a data source, you store metadata about the data source (usually connection information). Creating a data source does not initiate any operations on the data source. To perform operations, create operations in scenarios and execute them.

Delete a Data Source

  1. Right click on the source name in the Sources section and select Delete.
  2. Confirm that you want to delete the source by clicking the OK button.

Upload Libraries to a Data Source

Many operations (but not all) require libraries to be uploaded to the data source. Libraries usually consist of stored procedures implemented in a language supported by the data source.

  1. Right click on the source name in the Sources section and select Upload libraries.
  2. The current window will show a confirmation message.

Managing Scenarios and Operations

A scenario is a procedure that consists of a sequence of operation calls. Scenarios are the basic mechanism for performing operations on data sources and are used to implement all kinds of applications: ETL, Data Quality, etc. Operations in scenarios may have operation-specific parameters and usually have data location parameters, such as data source name (created in the Sources section), space name, table name, target source name, target space name, and target table name. The target parameters specify where the result of the operation will be stored. Many operations take the name of a specification to be executed (created in the Specifications section) as a parameter. Such operations do not require source and target parameters, as these are usually part of the specification.
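
As a rough illustration, a scenario can be pictured as an ordered list of operation calls, each carrying its own parameters. The sketch below is not InfoLink code: the parameter keys (source, space, table, target_source, and so on), the stub handlers, and all sample values are hypothetical. Only the operation names LoadTableFromSource and MatchOneSource come from this guide.

```python
from dataclasses import dataclass, field


@dataclass
class OperationCall:
    operation: str                 # e.g. "LoadTableFromSource"
    params: dict = field(default_factory=dict)


@dataclass
class Scenario:
    name: str
    operations: list = field(default_factory=list)


def execute_in_order(scenario, handlers):
    """Call each operation's handler in the order it appears in the scenario."""
    return [handlers[call.operation](**call.params) for call in scenario.operations]


# Stub handlers: they only describe what a real operation would do.
def load_table_from_source(source, space, table,
                           target_source, target_space, target_table):
    return f"{source}.{space}.{table} -> {target_source}.{target_space}.{target_table}"


def match_one_source(specification):
    # Source and target locations are part of the specification itself,
    # so the call carries only the specification name.
    return f"executed specification {specification}"


handlers = {
    "LoadTableFromSource": load_table_from_source,
    "MatchOneSource": match_one_source,
}

nightly = Scenario("nightly_customer_load", [
    OperationCall("LoadTableFromSource", {
        "source": "crm_db", "space": "public", "table": "customers",
        "target_source": "staging_db", "target_space": "stage",
        "target_table": "customers_raw",
    }),
    OperationCall("MatchOneSource", {"specification": "match_customers"}),
])

results = execute_in_order(nightly, handlers)
for line in results:
    print(line)
```

The first call shows the full set of data location parameters; the second shows a specification-driven operation that needs none of them.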

Create a Scenario

  1. Right click on the Scenarios section in the navigation tree and select Create scenario.
  2. Enter the scenario name and click the Create button. The scenario will be created and added to the Scenarios section.

Delete a Scenario

  1. Right click on the scenario name in the Scenarios section and select Delete.
  2. Confirm that you want to delete the scenario by clicking the OK button.

Create an Operation in a Scenario

Operations are always part of a scenario. To create an operation in a scenario, either append it as the last operation in the scenario or insert the operation before or after an existing operation.

To append an operation to a scenario:

  1. Right click on the scenario name, hover over Append operation, and select an operation. The current window will display the operation window.
  2. Enter any operation-specific parameters in the operation window.

To insert an operation before or after an existing operation:

  1. Expand the scenario by clicking on the > sign to the left of the scenario name.
  2. Right click on the operation title (which usually takes the form “<operation name>:<name of the most descriptive parameter>”), hover over Insert operation before or Insert operation after, and select an operation. The current window will display the operation window.
  3. Enter any operation-specific parameters in the operation window.
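
The effect of the three menu actions on the order of operations can be sketched with a plain list. The helper names and sample titles below are hypothetical and only model the ordering; the titles follow the “<operation name>:<descriptive parameter>” form mentioned above.

```python
def operation_title(operation, descriptive_param):
    # Mirrors the title form "<operation name>:<descriptive parameter>".
    return f"{operation}:{descriptive_param}"


def append_operation(ops, new_op):
    ops.append(new_op)                   # becomes the last operation


def insert_before(ops, existing_index, new_op):
    ops.insert(existing_index, new_op)   # takes the existing operation's place


def insert_after(ops, existing_index, new_op):
    ops.insert(existing_index + 1, new_op)


ops = [operation_title("LoadTableFromSource", "customers")]
append_operation(ops, operation_title("MatchOneSource", "match_customers"))
insert_before(ops, 0, operation_title("RunSourceScript", "prepare_staging"))
# Index 1 is now the LoadTableFromSource operation.
insert_after(ops, 1, operation_title("Transform", "clean_customers"))

for position, title in enumerate(ops, start=1):
    print(position, title)
```

Because a scenario runs its operations in sequence, the position chosen at insertion time determines when the new operation executes relative to its neighbors.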

Delete an Operation

To delete an operation, right click on the operation title and select Delete.

Run a Scenario

A scenario is executed as an asynchronous job. When you start a scenario, you get the job’s identifier. Check the Activity Log to see the job’s status; the job is listed by its identifier. The Activity Log is not updated automatically; click the Activity Log to refresh its contents.

  1. Right click on the scenario name and select Run. The current window will show the identifier of the job executing the scenario.
  2. Click on the Activity Log in the navigation tree to see the job status. The job is listed by its identifier. The job status can be running, completed, or completedWithError. Click the Activity Log periodically to refresh the job status.
  3. To see current job progress details, expand the job’s record in the Activity Log. It will show a list of operations that were already completed.
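
The asynchronous behavior described above (start a job, receive an identifier, refresh the log manually) can be simulated in a few lines. This is a sketch under stated assumptions, not InfoLink's implementation: the ActivityLog class and run_scenario function are hypothetical, and stopping at the first failed operation is an assumption that this guide does not confirm. The three status values are the ones listed in step 2.

```python
import threading
import time
import uuid


class ActivityLog:
    def __init__(self):
        self._lock = threading.Lock()
        self._jobs = {}

    def start(self, job_id):
        with self._lock:
            self._jobs[job_id] = {"status": "running", "completed": []}

    def operation_done(self, job_id, title):
        with self._lock:
            self._jobs[job_id]["completed"].append(title)

    def finish(self, job_id, ok):
        with self._lock:
            self._jobs[job_id]["status"] = "completed" if ok else "completedWithError"

    def snapshot(self, job_id):
        # Like clicking the Activity Log: a copy of the state at this moment.
        with self._lock:
            job = self._jobs[job_id]
            return {"status": job["status"], "completed": list(job["completed"])}


def run_scenario(log, operations):
    """Start the operations in a background thread and return the job id at once."""
    job_id = uuid.uuid4().hex
    log.start(job_id)

    def worker():
        ok = True
        try:
            for title, action in operations:
                action()
                log.operation_done(job_id, title)
        except Exception:
            ok = False               # assumption: the first failure ends the job
        log.finish(job_id, ok)

    threading.Thread(target=worker, daemon=True).start()
    return job_id


log = ActivityLog()
job_id = run_scenario(log, [
    ("LoadTableFromSource:customers", lambda: time.sleep(0.05)),
    ("MatchOneSource:match_customers", lambda: time.sleep(0.05)),
])

# Refresh periodically, as a user would by clicking the Activity Log.
while log.snapshot(job_id)["status"] == "running":
    time.sleep(0.02)

final = log.snapshot(job_id)
print(final["status"], final["completed"])
```

Each call to snapshot returns a copy, so a value read earlier never changes on its own, which is the same reason the Activity Log has to be clicked again to show progress.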

Run an Operation

You can run a single operation in a scenario. This is useful for debugging purposes. The operation is executed as a synchronous job: the current window will show a spinner until execution is completed, then the result will be shown.

  1. Right click on the operation title and select Run. The current window will show the running status and then the result.