Sxgzip Compression Utility

This package contains sxgzip, sxgunzip, and sxzcat: multithreaded compression and decompression utilities.

Sxgzip compresses using threads to make use of multiple processors and cores. The input is broken into blocks, each compressed independently and in parallel using the deflate compression method. The compression produces independent raw deflate streams, which are wrapped with the appropriate gzip header and trailer. This adds overhead to the output for each input block, but allows parallel decompression. The overhead consists of a gzip header (10 bytes), a trailer (8 bytes), and an SX header extension (24 bytes) added to each input block, plus the zlib format wrapping and deflate overhead that results from dividing the input stream into separate chunks.
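As a rough illustration of the fixed part of that overhead, the following sketch adds up the per-block byte counts quoted above and multiplies them by the number of blocks. The function name fixed_overhead and the 1 GB input size are assumptions made for the example; the deflate-level overhead caused by splitting the stream depends on the data and is not estimated here.

```shell
# Fixed per-block overhead in bytes, taken from the figures above:
# gzip header (10) + gzip trailer (8) + SX header extension (24).
per_block=$((10 + 8 + 24))

# Estimate the fixed overhead for an input of $1 bytes split into
# blocks of $2 bytes. Deflate-level overhead is not included.
fixed_overhead() {
    size=$1
    blocksize=$2
    blocks=$(( (size + blocksize - 1) / blocksize ))   # ceiling division
    echo $(( blocks * per_block ))
}

# Example: a 1000000000-byte input with 1000000-byte blocks.
fixed_overhead 1000000000 1000000
```

With these numbers the input splits into 1000 blocks, so the sketch prints 42000, which is a small fraction of the input size.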

Download

The utilities are distributed within the SpectX installation package. A SpectX trial is available for download at https://www.spectx.com/#signup.

Installation

The utilities require Oracle’s Java Runtime Environment (JRE) 1.8 to be installed on the system. It is available for download from the Oracle download site. Please note that using OpenJDK is not recommended, as it results in reduced performance. Check your Java version first by running the following command, then install or upgrade if needed:

$ java -version

  1. The installation package can be found in the /tools subdirectory of the SpectX installation directory:

    ./
    └── spectx/
        ├── bin/
        ├── conf/
        ├── lib/
        └── tools/
            └── sxgziputil-v{version}.tar.gz
    
  2. Copy the installation package to the desired directory and unpack it:

    $ tar -zxf sxgziputil-v{version}.tar.gz
    

This will extract:

./
└── sxgziputil/
    ├── bin/        :contains scripts and binaries
    └── man/        :contains man pages

  3. Update your PATH environment variable with the path to the sxgziputil/bin directory.
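
For a POSIX-compatible shell, the PATH update might look like the sketch below. The variable SXGZIPUTIL_HOME and the location /opt/sxgziputil are assumptions for illustration only; substitute the directory where you unpacked the archive.

```shell
# Assumed unpack location; replace with the directory that holds sxgziputil/.
SXGZIPUTIL_HOME="${SXGZIPUTIL_HOME:-/opt/sxgziputil}"

# Append the bin directory to PATH for the current session.
PATH="$PATH:$SXGZIPUTIL_HOME/bin"
export PATH

# Report whether the shell can now locate the tool.
if command -v sxgzip >/dev/null 2>&1; then
    echo "sxgzip found at: $(command -v sxgzip)"
else
    echo "sxgzip not found; check SXGZIPUTIL_HOME" >&2
fi
```

To make the change permanent, the same lines can be added to your shell startup file.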

Man page

sxgzip

NAME

sxgzip, sxgunzip, sxzcat — multithreaded compression/decompression tools

SYNOPSIS

sxgzip -b blocksize -cdfhik -l level -n -p threads -r -S suffix -tvV file …

sxgunzip -cfhiknN -p threads -r -S suffix -tvV file …

sxzcat -hn -p threads -vV file …

DESCRIPTION

sxgzip compresses using threads to make use of multiple processors and cores. The input is broken up into chunks, each compressed in parallel using the deflate compression method. The compression produces independent raw deflate streams, which are wrapped with the appropriate gzip header and trailer and written in order to the output. This adds overhead to the output for each input chunk, but allows parallel decompression. The overhead consists of a gzip header (10 bytes), a trailer (8 bytes), and an SX header extension (24 bytes) added to each input block, plus the zlib format wrapping and deflate overhead that results from dividing the input stream into separate chunks.

The default input block size is 1000000 bytes; it can be changed with the -b option. The number of compression threads defaults to max(2, available_cores / 2) and can be changed with the -p option.
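
To see what the default thread count formula yields on a given machine, it can be evaluated in the shell as sketched below. The helper name default_threads is hypothetical, and both the use of integer division and the way the core count is obtained (nproc or getconf) are assumptions; the tool itself may count cores differently.

```shell
# Evaluate max(2, cores / 2) for a given core count, using integer division.
default_threads() {
    half=$(( $1 / 2 ))
    if [ "$half" -lt 2 ]; then
        echo 2
    else
        echo "$half"
    fi
}

# Core count as reported by the OS: nproc (GNU coreutils), else getconf.
cores=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)
echo "cores=$cores default_threads=$(default_threads "$cores")"
```

Under these assumptions, an 8-core machine would default to 4 threads, while machines with 4 or fewer cores would use 2.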

Compressed files can be restored to their original form using sxgzip -d or sxgunzip. As the files are fully compatible with the gzip format, they can also be processed with the original gzip utility.
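
The compatibility rests on the gzip format allowing a file to consist of several members written one after another. The sketch below illustrates that property using only the standard gzip utility; it does not call sxgzip and does not reproduce the SX header extension, and the file name demo.gz is made up for the example.

```shell
# Work in a scratch directory so no existing files are touched.
tmp=$(mktemp -d)

# Compress two chunks independently, then concatenate the gzip members.
printf 'first chunk\n'  | gzip -c >  "$tmp/demo.gz"
printf 'second chunk\n' | gzip -c >> "$tmp/demo.gz"

# Standard gzip decodes all members in order and yields the original data.
restored=$(gzip -dc "$tmp/demo.gz")
printf '%s\n' "$restored"

rm -rf "$tmp"
```

Both lines come back in their original order, which is the behaviour a multi-member archive relies on when it is read by plain gzip.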

In other respects, the programs behave like the original gzip utilities. If no files are specified, sxgzip will compress from standard input, or decompress to standard output. In compression mode, each file is replaced, if possible, by another file whose name has the suffix set by the -S suffix option appended.

In decompression mode, each file will be checked for existence, as will the file with the suffix added. Each file argument must contain one separate complete sxgz archive; when multiple files are indicated, each is decompressed in turn.

In the case of sxzcat the resulting data is then concatenated in the manner of cat.

If invoked as sxgunzip then the -d option is enabled. If invoked as sxzcat then the -c, -d and -k options are enabled.

OPTIONS

-b,--blocksize <num>

length of plain text block in bytes, default is 1000000

-c,--stdout

write on standard output, keep original files unchanged

-d,--decompress

decompress files

-f,--force

force overwrite of output file

-h,--help

give this help

-i,--info

print chunk info, -p ignored

-k,--keep

keep (don’t delete) input files

-l,--level <num>

compression level: 1 fastest (worst) compression .. 9 best (slowest) compression

-n,--no-name,--no-crc

when compressing, do not store the filename and timestamp in the output file; when decompressing, skip CRC checks

-N,--name

when decompressing, use filename stored in the input file (if any) as the output file name

-p,--num-threads <num>

number of threads to use, default is max(2, cores/2)

-r,--recursive

recursively process files in directories

-S,--suffix <SUF>

use suffix .SUF instead of .sx.gz on compressed files

-t,--test

test compressed file and print chunk info, -p ignored

-v,--verbose

verbose mode, print extra info to standard error

-V,--version

display program version
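
EXAMPLES

The following sketch combines several of the options above in a typical session. The input file access.log is hypothetical, and the output name access.log.sx.gz assumes that the default .sx.gz suffix is appended to the input name in the manner of gzip. The commands are guarded so that nothing runs unless sxgzip is on the PATH and the file exists.

```shell
# Hypothetical input file; substitute your own.
f=access.log

if command -v sxgzip >/dev/null 2>&1 && [ -f "$f" ]; then
    # Compress with 4 threads, 2000000-byte blocks, and level 9; keep the input.
    sxgzip -k -p 4 -b 2000000 -l 9 "$f"

    # Test the compressed file and print chunk info.
    sxgzip -t "$f.sx.gz"

    # Stream the decompressed data to standard output; show the first lines.
    sxzcat "$f.sx.gz" | head -n 5
else
    echo "sxgzip or $f not available; commands not run" >&2
fi
```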