Read From Hive File
- Connectivity to HDFS and Hive from Spectrum on Windows
- Support and connectivity to Hadoop 3.x from Spectrum with high availability
- Kerberos-enabled HDFS connectivity through Windows
- Support for the Datetime datatype in the Parquet file format
Also see Configuring HDFS Connection for HA Cluster and Best Practices for connecting to HDFS 3.x and Hive 2.1.1.
Related task:
Connecting to Hadoop: To use the Read from Hive File stage, you need to create a connection to the Hadoop file server. Once you do, the name under which you save the connection is displayed as the server name.
File Properties tab
Fields | Description |
---|---|
Server | Indicates that the file you select in the File name field is located on the Hadoop system. If you select a file on the Hadoop system, the server name is the name you specified when creating the file server connection. Note: You need to create a connection to the Hadoop file server before using it here. For details on creating a connection, see Connecting to Hadoop. |
File name | Specifies the path to the file. Click the ellipsis button (...) to browse to the file you want. When you select the file, its first 50 records are fetched into the Preview grid. Note: The schema of an input file is imported as soon as you browse to the correct location and select the file. This imported schema cannot be edited. You may, however, rename the columns of the schema as required. |
File type | Select the type of the file being read: |
Fields tab
The Fields tab defines the names, datatypes, and positions of the fields as present in the input file, as well as the user-given names for the fields. For more information, see Defining Fields for Reading from Hive File.
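The relationship between the imported schema, the user-given names, and the 50-record preview can be pictured with a short Python sketch. This is only an illustrative model, not Spectrum code or a Spectrum API: the `Field` class, the `preview` function, the `PREVIEW_LIMIT` constant, and the sample field names and values are all hypothetical.

```python
from dataclasses import dataclass
from itertools import islice
from typing import Any, Dict, Iterable, List, Tuple

PREVIEW_LIMIT = 50  # the Preview grid shows the first 50 records


@dataclass(frozen=True)
class Field:
    """One imported field; frozen to mirror the read-only imported schema."""
    position: int  # position of the field in the input file
    name: str      # name as present in the input file
    datatype: str  # datatype as present in the input file


def preview(records: Iterable[Tuple[Any, ...]],
            schema: List[Field],
            renames: Dict[str, str]) -> List[Dict[str, Any]]:
    """Return up to PREVIEW_LIMIT records keyed by user-given field names."""
    ordered = sorted(schema, key=lambda f: f.position)
    # A field without a user-given name keeps its name from the file.
    labels = [renames.get(f.name, f.name) for f in ordered]
    return [dict(zip(labels, row)) for row in islice(records, PREVIEW_LIMIT)]


# Hypothetical sample data: 200 records with two fields.
schema = [Field(0, "cust_id", "integer"), Field(1, "signup_ts", "datetime")]
rows = ((i, f"2024-01-01T00:00:{i % 60:02d}") for i in range(200))
grid = preview(rows, schema, {"cust_id": "CustomerID"})
```

In this sketch the renaming is kept in a separate mapping, so the names, datatypes, and positions read from the file stay untouched while the preview shows the user-given names.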