Connecting to Knox

An Apache Knox Gateway allows you to access a Hadoop service through the Knox security layer.

With this connection, you can create flows in Enterprise Designer that use stages in the Enterprise Big Data module to read data from and write data to Hadoop via Knox.

  1. Access the Data Sources page using one of these modules:
    Management Console:
    Access Management Console using the URL: http://server:port/managementconsole, where server is the server name or IP address of your Spectrum™ Technology Platform server and port is the HTTP port used by Spectrum™ Technology Platform.
    Note: By default, the HTTP port is 8080.
    Go to Resources > Data Sources.
    Metadata Insights:
    Access Metadata Insights using the URL: http://server:port/metadata-insights, where server is the server name or IP address of your Spectrum™ Technology Platform server and port is the HTTP port used by Spectrum™ Technology Platform.
    Note: By default, the HTTP port is 8080.
    Go to Connect.
  2. Click the Add connection button.
  3. In the Name field, enter a name for the connection. The name can be anything you choose.
    Note: Once you save a connection, you cannot change the name.
  4. In the Type field, choose Gateway.
  5. In the Gateway Type field, choose Knox.
  6. In the Host field, enter the host name or IP address of the node in the HDFS cluster running the gateway.
  7. In the Port field, enter the port number for the Knox gateway.
  8. In the User Name field, enter the user name for the Knox gateway.
  9. In the Password field, enter the password that authorizes your access to the Knox gateway.
  10. In the Gateway Name field, enter the name of the Knox gateway that you want to access.
  11. In the Cluster Name field, enter the name of the Hadoop cluster to be accessed.
  12. In the Protocol field, choose webhdfs.
  13. In the Service Name field, enter the name of the Hadoop service to be accessed.
  14. To test the connection, click Test.
  15. Click Save.
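The connection details entered above map onto a gateway URL. As a rough illustration (not part of the product configuration), Knox conventionally exposes WebHDFS at a path built from the gateway name and cluster name; the host, port, and names below are hypothetical placeholders:

```python
# Hypothetical connection values; substitute the values you entered above.
host = "knox.example.com"       # Host field
port = 8443                     # Port field
gateway_name = "gateway"        # Gateway Name field (Knox's default path segment)
cluster_name = "default"        # Cluster Name field (the Knox topology)

# Common Knox convention: https://<host>:<port>/<gateway>/<cluster>/webhdfs/v1
base_url = f"https://{host}:{port}/{gateway_name}/{cluster_name}/webhdfs/v1"

# Example: a LISTSTATUS request against /user through the gateway
list_url = f"{base_url}/user?op=LISTSTATUS"
print(list_url)
```

A request to such a URL would be authenticated with the user name and password from steps 8 and 9; the exact URL layout depends on how your Knox topology is deployed.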

After you have defined a Knox connection to an HDFS cluster, you can use the connection in Enterprise Designer in the Read from File and Write to File stages. To select the HDFS cluster, click Remote Machine when defining a file in a source or sink stage.