Compression Support for Hadoop

Spectrum™ Technology Platform supports the gzip (.gz) and bzip2 (.bz2) compression formats on Hadoop. When you use the Read from File or Write to File stage with an HDFS connection, include the extension for the required compression format (.gz or .bz2) in the File name field. Spectrum™ Technology Platform then handles compression and decompression based on that extension: the Read from File stage decompresses the file as it reads, and the Write to File stage compresses the file as it writes.
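The idea of choosing a codec from the file name extension can be illustrated outside the product. The following is a minimal sketch in Python using only the standard library and local files. It is not how Spectrum™ Technology Platform is implemented. The names OPENERS, open_by_extension, and round_trip, and the file name customers.csv.gz, are hypothetical and exist only for this example.

```python
import bz2
import gzip
import os
import tempfile

# Hypothetical mapping from file extension to an opener (illustrative only).
OPENERS = {".gz": gzip.open, ".bz2": bz2.open}


def open_by_extension(path, mode="rt"):
    """Open path in text mode with the codec implied by its extension.

    Files without a recognized extension are opened as plain text.
    """
    ext = os.path.splitext(path)[1].lower()
    opener = OPENERS.get(ext, open)
    return opener(path, mode, encoding="utf-8")


def round_trip(file_name, text):
    """Write text to file_name in a temporary directory, then read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, file_name)
        with open_by_extension(path, "wt") as f:   # compresses on write
            f.write(text)
        with open_by_extension(path, "rt") as f:   # decompresses on read
            return f.read()


if __name__ == "__main__":
    # The caller only changes the extension; the codec follows from it.
    print(round_trip("customers.csv.gz", "id,name\n1,Ada\n"))
    print(round_trip("customers.csv.bz2", "id,name\n1,Ada\n"))
```

In this sketch the caller never names a codec directly. Changing the extension in the file name is the only switch, which mirrors how the File name field is used in the stages described above.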