Best Practices for connecting to HDFS 3.x and Hive 2.1.1
- When creating tables and defining schemas, use lower case. Example: create table demo (id int, name string, salary int);
- Use fully qualified field names when reading through a Hive JDBC connection with the Read from DB stage. If you need to both read from and write to the database, create a Model Store connection instead.
- When connecting from Windows, specify the keytab file location using forward slashes (/).
- Avoid spaces in Hive file, table, and field names.
- Use the property grid to create the HDFS connection.
- Creating an HDFS connection:
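The naming recommendations above can be illustrated with a short HiveQL sketch; the table and column names here are hypothetical examples, not part of any required schema:

```sql
-- Lower-case schema, no spaces in table or field names.
create table demo (id int, name string, salary int);

-- Over the Hive JDBC connection, qualify each field with its table name.
select demo.id, demo.name, demo.salary from demo;
```

Qualifying every field (demo.id rather than id) avoids ambiguity when the Read from DB stage resolves column names through the JDBC driver.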
- If your cluster is secured (that is, HA-enabled or SSL-enabled), ensure that the value of the property fs.defaultFS is swebhdfs://<Nameservice name>, where swebhdfs stands for secure WebHDFS.
- If the cluster is not secured, you can use webhdfs. In either case, the nameservice name must be provided, because it is also used in other properties.
- Add this entry to the Wrapper.conf file: wrapper.java.additional.7=-Djavax.net.ssl.trustStore=<Path of truststore file>/<name of the truststore file>
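As an illustration, the connection settings above might look like the following; the nameservice name (mycluster) and the truststore path are hypothetical placeholders to be replaced with your cluster's values:

```
# fs.defaultFS for a secured (SSL-enabled) cluster, using secure WebHDFS:
fs.defaultFS=swebhdfs://mycluster

# fs.defaultFS for an unsecured cluster:
# fs.defaultFS=webhdfs://mycluster

# Wrapper.conf entry pointing the JVM at the SSL truststore (hypothetical path):
wrapper.java.additional.7=-Djavax.net.ssl.trustStore=/opt/security/truststore.jks
```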