# Parallel Zstandard (PZstandard)

Parallel Zstandard is a Pigz-like tool for Zstandard.
It provides Zstandard-format-compatible compression and decompression that can use multiple cores.
It splits the input into equal-sized chunks and compresses each chunk independently into a Zstandard frame.
It then concatenates the frames to produce the final compressed output.
Before each frame, PZstandard writes a 12-byte header, itself a skippable frame in the Zstandard format, which tells PZstandard the size of the next compressed frame.
PZstandard supports parallel decompression of files compressed with PZstandard.
When decompressing files compressed with Zstandard, PZstandard does IO in one thread and decompression in another.

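The framing described above can be sketched in a few lines of Python. This is an illustration only, not PZstandard's actual code: the function names are hypothetical, the frames passed to `concatenate` stand in for already-compressed Zstandard frames, and the sketch assumes the 4-byte payload of the skippable frame stores the next frame's size as a little-endian 32-bit integer. The magic number comes from the range the Zstandard format reserves for skippable frames.

```python
import struct

# Zstandard reserves magic numbers 0x184D2A50..0x184D2A5F for skippable frames.
SKIPPABLE_MAGIC = 0x184D2A50
HEADER_SIZE = 12  # 4-byte magic + 4-byte payload length + 4-byte payload


def split_chunks(data: bytes, chunk_size: int) -> list:
    """Split data into equal-sized chunks; only the last one may be shorter."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def make_size_header(frame_size: int) -> bytes:
    """Build a 12-byte skippable frame whose payload is the next frame's size."""
    # "<III" packs three little-endian unsigned 32-bit integers.
    # Assumption: the payload is exactly one 32-bit size field.
    return struct.pack("<III", SKIPPABLE_MAGIC, 4, frame_size)


def read_size_header(header: bytes) -> int:
    """Parse a 12-byte skippable frame and return the recorded frame size."""
    magic, payload_len, frame_size = struct.unpack("<III", header[:HEADER_SIZE])
    if magic != SKIPPABLE_MAGIC or payload_len != 4:
        raise ValueError("not a size header")
    return frame_size


def concatenate(frames: list) -> bytes:
    """Emit header + frame for every (already compressed) frame, in order."""
    return b"".join(make_size_header(len(f)) + f for f in frames)
```

Because a standard Zstandard decoder skips skippable frames, output laid out this way would remain readable by plain Zstandard, while a parallel decoder could read each header to locate frame boundaries without decompressing anything first.
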
## Usage

PZstandard supports the same command-line interface as Zstandard, and adds the `-p` option to specify the number of threads.
Dictionary mode is not currently supported.

Basic usage, where `-#` is the compression level:

    pzstd input-file -o output-file -p num-threads -#   # Compression
    pzstd -d input-file -o output-file -p num-threads   # Decompression

PZstandard also supports pipes and FIFOs:

    cat input-file | pzstd -p num-threads -# -c > /dev/null

For more options:

    pzstd --help

PZstandard tries to pick a sensible default number of threads when `-p` is not given (the default is displayed in `pzstd --help`).
If this default is not suitable, you can define `PZSTD_NUM_THREADS` at compile time to set the number of threads you prefer.

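When driving `pzstd` from a script, the basic compression and decompression invocations can be assembled programmatically. The helper below is a hypothetical convenience wrapper, not part of PZstandard; it uses only the `-d`, `-o`, `-p`, and `-#` options shown in this section and assumes `pzstd` is on the `PATH`.

```python
def pzstd_argv(input_file, output_file, num_threads=None, level=None,
               decompress=False):
    """Build the argument list for a basic pzstd invocation."""
    argv = ["pzstd"]
    if decompress:
        argv.append("-d")
    argv += [input_file, "-o", output_file]
    if num_threads is not None:
        # Omitting -p leaves the choice to pzstd's default thread count.
        argv += ["-p", str(num_threads)]
    if level is not None and not decompress:
        # The compression level is passed as -<level>, for example -3.
        argv.append("-{}".format(level))
    return argv
```

The resulting list can be passed to `subprocess.run(argv, check=True)`, which avoids shell quoting issues with unusual file names.
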
## Benchmarks

As a reference, PZstandard and Pigz were compared on an Intel Core i7 @ 3.1 GHz, each using 4 threads, with the [Silesia compression corpus](https://sun.aei.polsl.pl//~sdeor/index.php?page=silesia).

Compression Speed vs Ratio with 4 Threads | Decompression Speed with 4 Threads
------------------------------------------|-----------------------------------
![Compression Speed vs Ratio](images/Cspeed.png "Compression Speed vs Ratio") | ![Decompression Speed](images/Dspeed.png "Decompression Speed")

The test procedure was to run each of the following commands twice for each compression level and take the minimum time.

    time pzstd -# -p 4 -c silesia.tar > silesia.tar.zst
    time pzstd -d -p 4 -c silesia.tar.zst > /dev/null

    time pigz -# -p 4 -k -c silesia.tar > silesia.tar.gz
    time pigz -d -p 4 -k -c silesia.tar.gz > /dev/null

PZstandard was tested using compression levels 1-19, and Pigz was tested using compression levels 1-9.
Pigz cannot do parallel decompression; it simply does reading, decompression, and writing on separate threads.

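The "run twice, keep the minimum" procedure can be approximated with a small timing harness. This is a sketch rather than the script used for the published numbers: `min_time` and `shell_command` are hypothetical helpers, and the harness measures wall-clock time only, which may differ from whichever figure was read from `time`.

```python
import subprocess
import time


def min_time(run, runs=2):
    """Call run() `runs` times and return the smallest wall-clock time in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        run()
        timings.append(time.perf_counter() - start)
    return min(timings)


def shell_command(command):
    """Return a zero-argument callable that runs `command` through the shell."""
    # shell=True keeps the redirections (such as `> /dev/null`) in the command.
    return lambda: subprocess.run(command, shell=True, check=True)
```

For example, `min_time(shell_command("pzstd -d -p 4 -c silesia.tar.zst > /dev/null"))` would time the decompression command twice and return the faster run. Taking the minimum rather than the mean reduces the influence of one-off disturbances such as a cold file cache.
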
## Tests

Tests require that you have [gtest](https://github.com/google/googletest) installed.
Set `GTEST_INC` and `GTEST_LIB` in `Makefile` to specify the location of the gtest headers and libraries.
Alternatively, run `make googletest`, which will clone googletest and build it.
Run `make tests && make check` to run the tests.