Data Settings
The Data Settings page allows you to view and edit settings that characterize a dataset.
This page is displayed when you edit a dataset. Settings in the Data Settings section describe how the data is formatted. The Preview section displays the data correctly when these settings match the record format of the data.
Data Settings
- Character encoding
- Specifies the character encoding used in the source file.
- Field delimiter
- Specifies the character that separates values in a record.
- Text qualifier
- Specifies the character that encloses values that contain the delimiter.
- Line separator
- Specifies whether the file uses Windows or Unix line breaks.
- First row is a header row
- Select this check box if the first row contains column labels. Clear the check box if the first row contains data.
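As a rough illustration of how these five settings interact, the sketch below maps them onto Python's standard `csv` module. It does not show how the product parses files. The function name `parse_records` and its parameter names are hypothetical and chosen only to mirror the settings above.

```python
import csv
import io

def parse_records(raw_bytes, encoding="utf-8", delimiter=",",
                  qualifier='"', has_header=True):
    """Parse delimited bytes using settings analogous to those on this page.

    All parameter names are illustrative assumptions, not product settings keys.
    """
    # Character encoding: decode the raw bytes of the source file.
    text = raw_bytes.decode(encoding)
    # Line separator: newline="" passes \r\n (Windows) and \n (Unix)
    # through untranslated so the csv module can handle either.
    stream = io.StringIO(text, newline="")
    # Field delimiter and text qualifier map to delimiter and quotechar.
    rows = list(csv.reader(stream, delimiter=delimiter, quotechar=qualifier))
    # First row is a header row: split labels from data when selected.
    if has_header and rows:
        return rows[0], rows[1:]
    return None, rows

# A Windows-style file where one value contains the delimiter.
sample = 'city,note\r\n"Paris","Capital, of France"\r\n'.encode("utf-8")
header, records = parse_records(sample)
print(header)
print(records)
```

Because the second value is enclosed in the text qualifier, the comma inside it is treated as data rather than as a field separator.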
Field definition table
Settings in this table characterize fields in the dataset.
- Field Name
- Specifies the field name. Field names must start with an alphabetic character and may contain only alphanumeric, dash (-), and underscore (_) characters. Spaces and other special characters are not permitted.
- Data Type
- Specifies the primitive data type for the field. Choose the type that corresponds to the data in the field: Boolean, Float, Integer, Long, String, Date, Time, or DateTime. When you create a dataset, the data type for each field is initially set to the type whose format matches the field's data, according to the data type formats for the dataset. You can click the Edit data type formats button to change the default data type formats for a dataset.
- Semantic Type
- Semantic types describe the kind of information that the data represents. For example, a field with a Float type may semantically represent a currency, and a field with a String type may semantically represent a city. When you edit a pipeline, this setting determines which transforms appear in the Transforms Suggestions.
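The Field Name rules above can be checked with a short regular expression. The following is a minimal sketch, assuming that "alphabetic" and "alphanumeric" mean ASCII letters and digits; the product may accept a wider character set. The names `FIELD_NAME_PATTERN` and `is_valid_field_name` are hypothetical.

```python
import re

# Assumption: ASCII only. First character must be a letter; the rest may be
# letters, digits, dash (-), or underscore (_).
FIELD_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")

def is_valid_field_name(name: str) -> bool:
    """Return True if the whole name satisfies the assumed naming rules."""
    return FIELD_NAME_PATTERN.fullmatch(name) is not None

for candidate in ["order_id", "unit-price", "2nd_field", "unit price"]:
    print(candidate, is_valid_field_name(candidate))
```

Using `fullmatch` rather than `match` ensures that a trailing space or other disallowed character anywhere in the name causes the check to fail.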
Preview
The Preview table displays sample data from the data source. Field names and data display correctly in separate columns if the data settings match the data format. If the data does not display correctly, you can adjust the data settings to match the data. For example, if the data is uploaded as comma-separated values but Period (.) is selected for Field delimiter, the data appears in a single column.
The header row above the table shows column headings. If the First row is a header row check box is selected, the header row initially shows values from the first row of the input data. If you clear the First row is a header row check box, it initially shows Column_1, Column_2, and so forth.
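The two Preview behaviors described above (a mismatched delimiter collapsing the data into one column, and default headings when the first row is not a header) can be reproduced with Python's `csv` module. This is only a sketch of the general effect, not the product's preview logic. The function `preview` and its parameters are hypothetical.

```python
import csv
import io

def preview(text, delimiter=",", first_row_is_header=True):
    """Return (headings, data_rows) for delimited text.

    Hypothetical helper: mimics the described Preview behavior only.
    """
    rows = list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))
    if not rows:
        return [], []
    if first_row_is_header:
        # Headings come from the first row of the input data.
        return rows[0], rows[1:]
    # Otherwise generate Column_1, Column_2, ... and keep every row as data.
    headings = [f"Column_{i}" for i in range(1, len(rows[0]) + 1)]
    return headings, rows

data = "name,city\nAda,London\n"

# Settings match the data: two columns.
print(preview(data, delimiter=","))

# Period selected for comma-separated data: everything lands in one column.
print(preview(data, delimiter="."))

# Check box cleared: generated headings, and the first row is treated as data.
print(preview(data, first_row_is_header=False))
```

In the mismatched case, each record contains no period, so the whole line is returned as a single value, which is why the Preview shows one column.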