Introduction

Physical Architecture

InfoLink consists of the following components:

  • InfoLink Server: the InfoLink implementation, a Java EE 6 web application certified with Apache Tomcat 9.0.0 or higher;
  • Metadata repository: a database that contains metadata about all applications.

Data sources are external systems that InfoLink accesses and manages; they are not part of InfoLink itself. All data sources are treated equally and accessed through a unified interface. In particular, there is no special kind of source, such as a dedicated Stage database, reserved for intermediate data storage and transformations. You can use any data source (e.g. Oracle, PostgreSQL, or Hadoop) as a Stage database.
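
The idea of a single interface with no privileged Stage source can be illustrated with a minimal Python sketch. This is not InfoLink's actual API: the names DataSource, InMemorySource, and copy_via_stage are hypothetical, and the in-memory class only stands in for a real backend.

```python
from abc import ABC, abstractmethod


class DataSource(ABC):
    """Hypothetical uniform interface; not InfoLink's actual API."""

    def __init__(self, name):
        self.name = name

    @abstractmethod
    def read(self, table):
        """Return all rows of a table as a list of tuples."""

    @abstractmethod
    def write(self, table, rows):
        """Replace the contents of a table with the given rows."""


class InMemorySource(DataSource):
    """Stand-in for a real backend such as Oracle, PostgreSQL, or Hadoop."""

    def __init__(self, name):
        super().__init__(name)
        self._tables = {}

    def read(self, table):
        # Return a copy so callers cannot mutate the stored rows.
        return list(self._tables.get(table, []))

    def write(self, table, rows):
        self._tables[table] = list(rows)


def copy_via_stage(origin, stage, target, table):
    """Copy a table from origin to target, staging it in any DataSource."""
    stage.write(table, origin.read(table))
    # A transformation step could run against the stage at this point.
    target.write(table, stage.read(table))


crm = InMemorySource("crm")
warehouse = InMemorySource("warehouse")
staging = InMemorySource("staging")  # an ordinary source playing the Stage role

crm.write("account", [(1, "Acme"), (2, "Globex")])
copy_via_stage(crm, staging, warehouse, "account")
```

Because copy_via_stage depends only on the shared interface, any of the three sources could be passed as the stage argument without changing the function.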

InfoLink Unified Data Model (UDM)

InfoLink provides unified access to all data sources by mapping their data structures onto the InfoLink unified data model (UDM). The UDM consists of the following concepts:

  • Table - a named set of records. Objects stored in data sources are represented as tables in InfoLink. For example, CSV files on Amazon S3, Salesforce objects (e.g. Account, User), and tables in a relational database are all represented as tables in InfoLink.
  • Table schema - an ordered list of columns that defines the structure of a table (and all its records).
  • Record - an ordered list of values, one for each column of the table schema.
  • Space - a named collection of tables. For example, a schema in PostgreSQL is represented as a space in the InfoLink UDM.
  • Source - a named collection of spaces.
  • Format specification - specifies the archive format (e.g. ZIP) and the content format (e.g. CSV, XML) of a table. The user provides a format specification as a parameter of the Load operation to describe the format of a file in a file-based source (see the Universal Load Operation).
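
The containment relationships among these concepts (source, space, table, schema, record) can be sketched as plain Python data classes. This is only an illustration of the model described above: the class and field names (Column, TableSchema, FormatSpec, add_record, and so on) are assumptions, not InfoLink's own types.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Column:
    name: str
    type: str  # illustrative type label, e.g. "integer" or "text"


@dataclass(frozen=True)
class TableSchema:
    columns: tuple  # ordered list of Column objects


@dataclass
class Table:
    name: str
    schema: TableSchema
    records: list = field(default_factory=list)

    def add_record(self, *values):
        # A record carries exactly one value per schema column, in order.
        if len(values) != len(self.schema.columns):
            raise ValueError("record does not match the table schema")
        self.records.append(tuple(values))


@dataclass
class Space:
    name: str
    tables: dict = field(default_factory=dict)  # table name -> Table


@dataclass
class Source:
    name: str
    spaces: dict = field(default_factory=dict)  # space name -> Space


@dataclass(frozen=True)
class FormatSpec:
    content: str                   # e.g. "CSV" or "XML"
    archive: Optional[str] = None  # e.g. "ZIP"; None when not archived


# A PostgreSQL schema "public" modeled as a space holding one table.
schema = TableSchema((Column("id", "integer"), Column("name", "text")))
account = Table("account", schema)
account.add_record(1, "Acme")

public = Space("public", {"account": account})
postgres = Source("postgres", {"public": public})

# Format of a zipped CSV file in a file-based source.
zipped_csv = FormatSpec(content="CSV", archive="ZIP")
```

In this sketch a table is addressed by walking source, then space, then table, for example postgres.spaces["public"].tables["account"].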