You should always consider using compression to save disk space and to speed up reading and writing data. Here are a few things to consider:
- Do not compress files smaller than 4k
- Many small files should be packed in a tar archive and compressed to a single file
- Decompression is normally much faster than compression
- Using transparent compression in simulations improves performance in most cases
- Better compression takes more time and more memory (but this is normally not a problem)
- Use a suitable algorithm (compressed size vs. runtime tradeoff): see benchmarks below
- xz may not be a good choice (see here)
- If in doubt: use plzip for fast and good compression, zpaq -m 5 for best compression
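
The 4k rule from the list above can be applied mechanically. The following is a minimal sketch, assuming a POSIX shell with find, gzip, awk and mktemp. The directory and file names are made up for illustration, and gzip merely stands in for whichever compressor you choose.

```shell
#!/bin/sh
# Sketch: compress only files of at least 4 KiB and leave smaller ones untouched.
set -eu

# Hypothetical sample data: one tiny file and one larger, highly repetitive file.
workdir=$(mktemp -d)
printf 'tiny\n' > "$workdir/small.dat"
awk 'BEGIN { for (i = 0; i < 2000; i++) print "some repetitive simulation output" }' \
    > "$workdir/large.dat"

# "-size +4095c" matches files larger than 4095 bytes, i.e. 4 KiB and up.
find "$workdir" -type f -name '*.dat' -size +4095c -exec gzip {} +

# small.dat is left as it is, large.dat has become large.dat.gz.
ls -l "$workdir"
```

To use it on real data, drop the sample-data lines and point find at your own directory.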
Usage
Single file compression:
gzip (-9) file.dat -> file.dat.gz
(p)bzip2 file.dat -> file.dat.bz2
(p)lzip (-9) file.dat -> file.dat.lz
xz (-9) (-T N) file.dat -> file.dat.xz
7z a archive.7z file.dat -> archive.7z
zpaq (-m 5) a archive.zpaq file.dat -> archive.zpaq
Single file decompression:
gzip -d file.dat.gz -> file.dat
(p)bzip2 -d file.dat.bz2 -> file.dat
(p)lzip -d file.dat.lz -> file.dat
xz -d file.dat.xz -> file.dat
7z x archive.7z file.dat -> file.dat
zpaq x archive.zpaq file.dat -> file.dat
Directory compression:
tar zcf data.tar.gz data/
tar jcf data.tar.bz2 data/
tar --lzip -cf data.tar.lz data/
tar Jcf data.tar.xz data/
7z a data.7z data/
zpaq (-m 5) a data.zpaq data/
Alternatively, first create a tar archive with tar cf data.tar data/ and then compress it as a single file.
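
The intermediate tar file does not have to touch the disk: tar can write the archive to standard output and the compressor can read it from standard input. A minimal sketch, assuming a POSIX shell. The sample directory is created only for illustration, and gzip is used because it is available almost everywhere. bzip2, lzip/plzip and xz also read standard input when no file is given, so they should work the same way.

```shell
#!/bin/sh
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample directory with several small files.
mkdir data
for i in 1 2 3 4 5; do
    echo "result $i" > "data/run_$i.txt"
done

# Pack and compress in one step; "-" makes tar write the archive to stdout.
tar cf - data/ | gzip -9 > data.tar.gz

# Unpack into a separate directory; "-" makes tar read the archive from stdin.
mkdir restored
gzip -dc data.tar.gz | tar xf - -C restored

# Compare the restored tree with the original.
diff -r data restored/data && echo "round trip OK"
```

Checking the round trip before deleting the original directory is cheap, since decompression is normally much faster than compression.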
Example benchmarks
Each cell gives the compressed size (compression time/decompression time).

program/file | kernel 5.2.10 source | ASCII data | blender executable |
---|---|---|---|
uncompressed | 832M | 97M | 58M |
gzip | 159M (21 s/4 s) | 41M (6.7 s/0.7 s) | 23M (2.8 s/0.4 s) |
gzip -9 | 157M (39 s/4 s) | 41M (12 s/0.7 s) | 23M (9 s/0.4 s) |
bzip2 | 122M (61 s/18 s) | 36M (7.1 s/3.8 s) | 21M (4.2 s/1.9 s) |
pbzip2 (4 cores) | 123M (18 s/7 s) | 36M (2.1 s/1.6 s) | 21M (1.3 s/0.7 s) |
lzip | 108M (292 s/8 s) | 33.7M (88 s/2.9 s) | 16.0M (24 s/1.1 s) |
plzip (4 cores) | 109M (84 s/2.4 s) | 33.7M (33 s/1.0 s) | 16.8M (9 s/0.4 s) |
plzip -9 (4 cores) | 103M (180 s/2.4 s) | 33.7M (85 s/2.0 s) | 15.8M (33 s/1.1 s) |
xz -9 -T 4 | 102M (112 s/6 s) | 33.9M (118 s/2.6 s) | 15.8M (24 s/1.0 s) |
7z | 108M (109 s/6 s) | 33.9M (50 s/2.3 s) | 15.6M (13 s/0.9 s) |
zpaq -m 5 | 77M (12 min/*) | 28.4M (178 s/*) | 13.9M (143 s/*) |
cmix (v18) | 47M (17.4 days/*) | 23M (30.3 h/*) | 9.4M (26.5 h/*) |
* Decompression takes the same time as compression, even when extracting a single file from an archive.
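
The numbers depend strongly on the data, so it is worth repeating the comparison on a sample of your own files. A rough sketch, assuming a POSIX shell and that `date +%s` is supported (it is on common Linux and BSD systems). The generated sample file is only a placeholder, and the timings have one-second resolution, so use a sufficiently large input.

```shell
#!/bin/sh
set -eu
workdir=$(mktemp -d)
sample="$workdir/sample.dat"

# Placeholder input; replace with a representative file of your own.
awk 'BEGIN { for (i = 0; i < 200000; i++) print i, i * 0.5 }' > "$sample"
orig_size=$(wc -c < "$sample" | tr -d ' ')
echo "uncompressed: $orig_size bytes"

for tool in gzip bzip2 xz; do
    # Skip compressors that are not installed.
    command -v "$tool" > /dev/null 2>&1 || continue

    t0=$(date +%s)
    "$tool" -c "$sample" > "$sample.$tool"      # compress to a new file
    t1=$(date +%s)
    "$tool" -dc "$sample.$tool" > /dev/null     # decompress and discard
    t2=$(date +%s)

    size=$(wc -c < "$sample.$tool" | tr -d ' ')
    echo "$tool: $size bytes ($((t1 - t0)) s/$((t2 - t1)) s)"
done
```

Other compressors that accept -c and -d, such as lzip or plzip, can be added to the loop if they are installed.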