Best Practices for connecting to HDFS 3.x and Hive 2.1.1

  • When creating tables and defining schemas, use lowercase names. Example: Create Table demo (id int, name string, salary int).
  • Use fully qualified field names when using a Hive JDBC connection with the Read from DB stage. Otherwise, to read from and write to the database, create a Model Store connection.
  • When connecting from Windows, specify the location of the keytab file using forward slashes (/).
  • Avoid spaces in Hive file, table, and field names.
  • Use the property grid for creating the HDFS connection.
  • When creating the HDFS connection:
    • If your cluster is secured, that is, HA enabled or SSL enabled, set the value of the fs.defaultFS property to swebhdfs://<Nameservice name>, where swebhdfs stands for secured webhdfs.
    • If the cluster is not secured, you can use webhdfs instead. The nameservice must still be provided because other properties use it.
  • Add this entry to the Wrapper.conf file: wrapper.java.additional.7=-Djavax.net.ssl.trustStore=<Path of truststore file>/<name of the truststore file>
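
Several of the practices above can be checked before a connection is configured. The following is a minimal sketch in Java, not part of the product: the class name HadoopConnectionChecks, its method names, and the sample values (mycluster, C:\security\user.keytab) are all hypothetical and chosen only for illustration. It checks that a Hive name is lowercase and free of spaces, rewrites a Windows keytab path with forward slashes, and builds the fs.defaultFS value from a nameservice name.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; none of these names come from the product.
public class HadoopConnectionChecks {

    // True if the name is non-empty, all lowercase, and contains no whitespace.
    public static boolean isSafeHiveName(String name) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        boolean lowerCase = name.equals(name.toLowerCase(Locale.ROOT));
        boolean noSpaces = name.chars().noneMatch(Character::isWhitespace);
        return lowerCase && noSpaces;
    }

    // Replaces Windows backslashes with forward slashes in a keytab path.
    public static String toForwardSlashes(String path) {
        return path.replace('\\', '/');
    }

    // Builds the fs.defaultFS value: swebhdfs for a secured cluster, webhdfs otherwise.
    public static String defaultFs(String nameservice, boolean secured) {
        String scheme = secured ? "swebhdfs" : "webhdfs";
        return scheme + "://" + nameservice;
    }

    public static void main(String[] args) {
        System.out.println(isSafeHiveName("demo"));        // prints true
        System.out.println(isSafeHiveName("Demo Table"));  // prints false
        System.out.println(toForwardSlashes("C:\\security\\user.keytab")); // prints C:/security/user.keytab
        System.out.println(defaultFs("mycluster", true));  // prints swebhdfs://mycluster
        System.out.println(defaultFs("mycluster", false)); // prints webhdfs://mycluster
    }
}
```

The checks are deliberately simple string operations, so they can run on a workstation without access to the cluster. They do not verify that the nameservice exists or that the keytab file is valid.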