Character Encodings

This topic describes character encodings in various file formats.

CP1252: Also known as Windows-1252 or simply the Windows character set. It is a superset of ISO-8859-1 that uses the 128-159 code range for additional printable characters not included in the ISO-8859-1 character set.
UTF-8: Supports all Unicode characters and is backwards-compatible with ASCII. For more information about UTF, see unicode.org/faq/utf_bom.html.
UTF-16: Supports all Unicode characters but is not backwards-compatible with ASCII. For more information about UTF, see unicode.org/faq/utf_bom.html.
US-ASCII: A 7-bit character encoding based on the English alphabet.
UTF-16BE: UTF-16 encoding with big endian byte serialization (most significant byte first).
UTF-16LE: UTF-16 encoding with little endian byte serialization (least significant byte first).
ISO-8859-1: An 8-bit, ASCII-based character encoding typically used for Western European languages. Also known as Latin-1.
ISO-8859-3: An 8-bit, ASCII-based character encoding typically used for Southern European languages. Also known as Latin-3.
ISO-8859-9: An 8-bit, ASCII-based character encoding typically used for the Turkish language. Also known as Latin-5.
CP850: An ASCII-based code page used to write Western European languages.
CP500: An EBCDIC code page used to write Western European languages.
Shift_JIS: A character encoding for the Japanese language.
MS932: Microsoft's extension of Shift_JIS that includes NEC special characters, the NEC selection of IBM extensions, and IBM extensions.
CP1047: An EBCDIC code page with the full Latin-1 character set.
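
The differences between these encodings are easiest to see by comparing the bytes each one produces for the same character. The following is a minimal sketch in Python, not part of the product. The helper names `hex_bytes` and `can_encode` are hypothetical, and the codec names are the ones used by Python's standard library, which may differ from the names shown in the stage options. Python's `cp932` codec stands in for MS932 here, and CP1047 is omitted because the sketch assumes only the standard library codecs.

```python
def hex_bytes(text: str, codec: str) -> str:
    """Encode text with the given codec and return the bytes as hex pairs."""
    return " ".join(f"{b:02x}" for b in text.encode(codec))


def can_encode(text: str, codec: str) -> bool:
    """Report whether every character in text exists in the codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    # UTF-8 is backwards-compatible with ASCII: both give the single byte 41.
    print("utf-8     A:", hex_bytes("A", "utf-8"))
    print("ascii     A:", hex_bytes("A", "ascii"))

    # UTF-16 uses two bytes for the same character, and byte order matters.
    print("utf-16-be A:", hex_bytes("A", "utf-16-be"))  # 00 41
    print("utf-16-le A:", hex_bytes("A", "utf-16-le"))  # 41 00

    # EBCDIC code pages place letters at different code points than ASCII.
    print("cp500     A:", hex_bytes("A", "cp500"))      # c1

    # CP1252 uses the 128-159 range for extra characters such as the euro
    # sign (U+20AC), which ISO-8859-1 cannot represent.
    euro = "\u20ac"
    print("cp1252 euro:", hex_bytes(euro, "cp1252"))    # 80
    print("latin-1 can encode euro:", can_encode(euro, "iso-8859-1"))

    # ISO-8859-9 carries Turkish letters such as s-cedilla (U+015F).
    s_cedilla = "\u015f"
    print("iso-8859-9 s-cedilla:", hex_bytes(s_cedilla, "iso-8859-9"))
    print("latin-1 can encode s-cedilla:", can_encode(s_cedilla, "iso-8859-1"))

    # Circled digit one (U+2460) is one of the NEC special characters.
    circled_one = "\u2460"
    print("cp932 circled one:", hex_bytes(circled_one, "cp932"))
    print("shift_jis can encode it:", can_encode(circled_one, "shift_jis"))
```

A check like `can_encode` can be useful before choosing an encoding for an input or output file: if a sample of the data cannot be encoded, the chosen encoding is probably not the one the file needs.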