An Apache Knox Gateway allows you to access a Hadoop service through the Knox security layer. With this connection, you can create flows in Spectrum Enterprise Designer using stages in Enterprise Big Data to read data from and write data to Hadoop via Knox.
- Access the Connections page using one of these:
  - Spectrum Management Console:
    - Access Spectrum Management Console using the URL: http://server:port/managementconsole, where server is the server name or IP address of your Spectrum Technology Platform server and port is the HTTP port used by Spectrum Technology Platform.
      Note: By default, the HTTP port is 8080.
    - Click .
  - Spectrum Discovery:
    - Access Spectrum Discovery using the URL: http://server:port/discovery, where server is the server name or IP address of your Spectrum Technology Platform server and port is the HTTP port used by Spectrum Technology Platform.
      Note: By default, the HTTP port is 8080.
    - Click Connect.
- Click the Add connection button.
- In the Connection Name box, enter a name for the connection. The name can be anything you choose.
  Note: Once you save a connection, you cannot change the name.
- In the Connection Type field, choose Gateway.
- In the Gateway Type field, choose Knox.
- In the Host field, enter the host name or IP address of the node in the HDFS cluster running the gateway.
- In the Port field, enter the port number for the Knox gateway.
- In the User Name field, enter the user name for the Knox gateway.
- In the Password field, enter the password that authorizes your access to the Knox gateway.
- In the Gateway Name field, enter the name of the Knox gateway you wish to access.
- In the Cluster Name field, enter the name of the Hadoop cluster to be accessed.
- In the Protocol field, choose webhdfs. (The sketch after these steps shows how this value combines with the host, port, gateway name, and cluster name to form the Knox URL.)
- In the Service Name field, enter the name of the Hadoop service to be accessed.
- To test the connection, click Test.
- Click Save.
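The values you enter above correspond to the pieces of the URL that Knox exposes for WebHDFS. The Python sketch below only illustrates that mapping; it is not part of Spectrum Technology Platform, and the host, credentials, directory path, and the default gateway path and topology names are assumed example values.

```python
# Illustrative sketch only: shows how the connection fields typically map onto a
# Knox WebHDFS URL. The host, credentials, and path are placeholder assumptions.
import requests

host = "knox.example.com"      # Host field (example value)
port = 8443                    # Port field (Knox commonly listens on 8443 over HTTPS)
gateway_name = "gateway"       # Gateway Name field (Knox's default gateway path)
cluster_name = "default"       # Cluster Name field (the Knox topology name)
protocol = "webhdfs"           # Protocol field
user, password = "knoxuser", "secret"   # User Name / Password fields (placeholders)

# Knox routes WebHDFS calls through:
#   https://<host>:<port>/<gateway-path>/<topology>/webhdfs/v1/<hdfs-path>
url = f"https://{host}:{port}/{gateway_name}/{cluster_name}/{protocol}/v1/tmp"

# List the /tmp directory through the gateway; verify=False is only for
# self-signed Knox certificates in a test environment.
response = requests.get(url, params={"op": "LISTSTATUS"},
                        auth=(user, password), verify=False)
print(response.status_code, response.json())
```

If Test fails, issuing an equivalent request directly against the Knox gateway in this way can help you tell a Knox or cluster configuration problem apart from a mistake in the connection definition.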
After you have defined a Knox connection to an HDFS cluster, you can use the connection in Spectrum Enterprise Designer in the Read from File and Write to File stages. You can select the HDFS cluster when you click Remote Machine when defining a file in a source or sink stage.