1 | LZMA SDK 22.01\r |
2 | --------------\r |
3 | \r |
4 | LZMA SDK provides the documentation, samples, header files,\r |
5 | libraries, and tools you need to develop applications that \r |
6 | use 7z / LZMA / LZMA2 / XZ compression.\r |
7 | \r |
8 | LZMA is an improved version of the famous LZ77 compression algorithm.\r |
9 | It was improved to maximize the compression ratio while\r |
10 | keeping high decompression speed and low memory requirements for\r |
11 | decompression.\r |
12 | \r |
13 | LZMA2 is an LZMA-based compression method. LZMA2 provides better\r |
14 | multithreading support for compression than LZMA, along with some other improvements.\r |
15 | \r |
16 | 7z is a file format for data compression and file archiving.\r |
17 | 7z is the main file format of the 7-Zip compression program (www.7-zip.org).\r |
18 | The 7z format supports different compression methods: LZMA, LZMA2, and others.\r |
19 | 7z also supports AES-256-based encryption.\r |
20 | \r |
21 | XZ is a file format for data compression that uses LZMA2 compression.\r |
22 | The XZ format provides additional features: SHA/CRC check, filters for\r |
23 | improved compression ratio, and splitting into blocks and streams.\r |
24 | \r |
25 | \r |
26 | \r |
27 | LICENSE\r |
28 | -------\r |
29 | \r |
30 | LZMA SDK is written and placed in the public domain by Igor Pavlov.\r |
31 | \r |
32 | Some code in LZMA SDK is based on public domain code from other developers:\r |
33 | 1) PPMd var.H (2001): Dmitry Shkarin\r |
34 | 2) SHA-256: Wei Dai (Crypto++ library)\r |
35 | \r |
36 | Anyone is free to copy, modify, publish, use, compile, sell, or distribute the \r |
37 | original LZMA SDK code, either in source code form or as a compiled binary, for \r |
38 | any purpose, commercial or non-commercial, and by any means.\r |
39 | \r |
40 | LZMA SDK code is compatible with open source licenses. For example, you can\r |
41 | include it in GNU GPL or GNU LGPL code.\r |
42 | \r |
43 | \r |
44 | LZMA SDK Contents\r |
45 | -----------------\r |
46 | \r |
47 | Source code:\r |
48 | \r |
49 | - C / C++ / C# / Java - LZMA compression and decompression\r |
50 | - C / C++ - LZMA2 compression and decompression\r |
51 | - C / C++ - XZ compression and decompression\r |
52 | - C - 7z decompression\r |
53 | - C++ - 7z compression and decompression\r |
54 | - C - small SFXs for installers (7z decompression)\r |
55 | - C++ - SFXs and SFXs for installers (7z decompression)\r |
56 | \r |
57 | Precompiled binaries:\r |
58 | \r |
59 | - console programs for lzma / 7z / xz compression and decompression\r |
60 | - SFX modules for installers.\r |
61 | \r |
62 | \r |
63 | UNIX/Linux version \r |
64 | ------------------\r |
65 | There are several options to compile 7-Zip with different compilers: gcc and clang.\r |
66 | Also, 7-Zip code contains two versions of some critical parts of the code: in C and in Assembler.\r |
67 | So if you compile the version with Assembler code, you will get a faster 7-Zip binary.\r |
68 | \r |
69 | 7-Zip's assembler code uses the following syntax for different platforms:\r |
70 | \r |
71 | 1) x86 and x86-64 (AMD64): MASM syntax. \r |
72 | There are 2 programs that support MASM syntax in Linux:\r |
73 | Asmc Macro Assembler and JWasm. But JWasm currently doesn't support some\r |
74 | CPU instructions used in 7-Zip.\r |
75 | So you must install Asmc Macro Assembler in Linux if you want to compile the fastest version\r |
76 | of 7-Zip for x86 and x86-64:\r |
77 | https://github.com/nidud/asmc\r |
78 | \r |
79 | 2) arm64: GNU assembler for ARM64 with preprocessor. \r |
80 | The syntax of the arm64 assembler code in 7-Zip is supported by GCC and CLANG for ARM64.\r |
81 | \r |
82 | There are different binaries that can be compiled from 7-Zip source.\r |
83 | There are 2 main files in the folder for compiling:\r |
84 | makefile - can be used for compiling the Windows version of 7-Zip with the nmake command\r |
85 | makefile.gcc - can be used for compiling Linux/macOS versions of 7-Zip with the make command\r |
86 | \r |
87 | First, change the current folder to the folder that contains `makefile.gcc`:\r |
88 | \r |
89 | cd CPP/7zip/Bundles/Alone7z\r |
90 | \r |
91 | Then you can build with `makefile.gcc` using the command:\r |
92 | \r |
93 | make -j -f makefile.gcc\r |
94 | \r |
95 | There are also additional "*.mak" files in the folder "CPP/7zip/" that can be used to compile\r |
96 | 7-Zip binaries with optimized code and optimizing options.\r |
97 | \r |
98 | To compile with GCC without assembler:\r |
99 | cd CPP/7zip/Bundles/Alone7z\r |
100 | make -j -f ../../cmpl_gcc.mak\r |
101 | \r |
102 | To compile with CLANG without assembler:\r |
103 | make -j -f ../../cmpl_clang.mak\r |
104 | \r |
105 | To compile 7-Zip for x86-64 with asmc assembler:\r |
106 | make -j -f ../../cmpl_gcc_x64.mak\r |
107 | \r |
108 | To compile 7-Zip for arm64 with assembler:\r |
109 | make -j -f ../../cmpl_gcc_arm64.mak\r |
110 | \r |
111 | To compile 7-Zip for arm64 for macOS:\r |
112 | make -j -f ../../cmpl_mac_arm64.mak\r |
113 | \r |
114 | Also you can change some compiler options in the mak files:\r |
115 | cmpl_gcc.mak\r |
116 | var_gcc.mak\r |
117 | warn_gcc.mak\r |
118 | \r |
119 | \r |
120 | \r |
121 | Also you can use p7zip (port of 7-Zip for POSIX systems like Unix or Linux):\r |
122 | \r |
123 | http://p7zip.sourceforge.net/\r |
124 | \r |
125 | \r |
126 | Files\r |
127 | -----\r |
128 | \r |
129 | DOC/7zC.txt - 7z ANSI-C Decoder description\r |
130 | DOC/7zFormat.txt - 7z Format description\r |
131 | DOC/installer.txt - information about 7-Zip for installers\r |
132 | DOC/lzma.txt - LZMA compression description\r |
133 | DOC/lzma-sdk.txt - LZMA SDK description (this file)\r |
134 | DOC/lzma-history.txt - history of LZMA SDK\r |
135 | DOC/lzma-specification.txt - Specification of LZMA\r |
136 | DOC/Methods.txt - Compression method IDs for .7z\r |
137 | \r |
138 | bin/installer/ - example script to create an installer that uses the SFX module.\r |
139 | \r |
140 | bin/7zdec.exe - simplified 7z archive decoder\r |
141 | bin/7zr.exe - 7-Zip console program (reduced version)\r |
142 | bin/x64/7zr.exe - 7-Zip console program (reduced version) (x64 version)\r |
143 | bin/lzma.exe - file->file LZMA encoder/decoder for Windows\r |
144 | bin/7zS2.sfx - small SFX module for installers (GUI version)\r |
145 | bin/7zS2con.sfx - small SFX module for installers (Console version)\r |
146 | bin/7zSD.sfx - SFX module for installers.\r |
147 | \r |
148 | \r |
149 | 7zDec.exe\r |
150 | ---------\r |
151 | 7zDec.exe is a simplified 7z archive decoder.\r |
152 | It supports only LZMA, LZMA2, and PPMd methods.\r |
153 | 7zDec decodes a whole solid block from a 7z archive to RAM.\r |
154 | The RAM consumption can be high.\r |
155 | \r |
156 | \r |
157 | \r |
158 | \r |
159 | Source code structure\r |
160 | ---------------------\r |
161 | \r |
162 | \r |
163 | Asm/ - asm files (optimized code for CRC calculation and Intel-AES encryption)\r |
164 | \r |
165 | C/ - C files (compression / decompression and other)\r |
166 | Util/\r |
167 | 7z - 7z decoder program (decoding 7z files)\r |
168 | Lzma - LZMA program (file->file LZMA encoder/decoder).\r |
169 | LzmaLib - LZMA library (.DLL for Windows)\r |
170 | SfxSetup - small SFX module for installers \r |
171 | \r |
172 | CPP/ -- CPP files\r |
173 | \r |
174 | Common - common files for C++ projects\r |
175 | Windows - common files for Windows related code\r |
176 | \r |
177 | 7zip - files related to 7-Zip\r |
178 | \r |
179 | Archive - files related to archiving\r |
180 | \r |
181 | Common - common files for archive handling\r |
182 | 7z - 7z C++ Encoder/Decoder\r |
183 | \r |
184 | Bundles - Modules that are bundles of other modules (files)\r |
185 | \r |
186 | Alone7z - 7zr.exe: Standalone 7-Zip console program (reduced version)\r |
187 | Format7zExtractR - 7zxr.dll: Reduced version of 7z DLL: extracting from 7z/LZMA/BCJ/BCJ2.\r |
188 | Format7zR - 7zr.dll: Reduced version of 7z DLL: extracting/compressing to 7z/LZMA/BCJ/BCJ2\r |
189 | LzmaCon - lzma.exe: LZMA compression/decompression\r |
190 | LzmaSpec - example code for LZMA Specification\r |
191 | SFXCon - 7zCon.sfx: Console 7z SFX module\r |
192 | SFXSetup - 7zS.sfx: 7z SFX module for installers\r |
193 | SFXWin - 7z.sfx: GUI 7z SFX module\r |
194 | \r |
195 | Common - common files for 7-Zip\r |
196 | \r |
197 | Compress - files for compression/decompression\r |
198 | \r |
199 | Crypto - files for encryption / decryption\r |
200 | \r |
201 | UI - User Interface files\r |
202 | \r |
203 | Client7z - Test application for 7za.dll, 7zr.dll, 7zxr.dll\r |
204 | Common - Common UI files\r |
205 | Console - Code for console program (7z.exe)\r |
206 | Explorer - Some code from 7-Zip Shell extension\r |
207 | FileManager - Some GUI code from 7-Zip File Manager\r |
208 | GUI - Some GUI code from 7-Zip\r |
209 | \r |
210 | \r |
211 | CS/ - C# files\r |
212 | 7zip\r |
213 | Common - some common files for 7-Zip\r |
214 | Compress - files related to compression/decompression\r |
215 | LZ - files related to LZ (Lempel-Ziv) compression algorithm\r |
216 | LZMA - LZMA compression/decompression\r |
217 | LzmaAlone - file->file LZMA compression/decompression\r |
218 | RangeCoder - Range Coder (special code of compression/decompression)\r |
219 | \r |
220 | Java/ - Java files\r |
221 | SevenZip\r |
222 | Compression - files related to compression/decompression\r |
223 | LZ - files related to LZ (Lempel-Ziv) compression algorithm\r |
224 | LZMA - LZMA compression/decompression\r |
225 | RangeCoder - Range Coder (special code of compression/decompression)\r |
226 | \r |
227 | \r |
228 | Note: \r |
229 | Asm / C / C++ source code of LZMA SDK is part of 7-Zip's source code.\r |
230 | 7-Zip's source code can be downloaded from 7-Zip's SourceForge page:\r |
231 | \r |
232 | http://sourceforge.net/projects/sevenzip/\r |
233 | \r |
234 | \r |
235 | \r |
236 | LZMA features\r |
237 | -------------\r |
238 | - Variable dictionary size (up to 1 GB)\r |
239 | - Estimated compressing speed: about 2 MB/s on 2 GHz CPU\r |
240 | - Estimated decompressing speed: \r |
241 | - 20-30 MB/s on modern 2 GHz cpu\r |
242 | - 1-2 MB/s on 200 MHz simple RISC cpu: (ARM, MIPS, PowerPC)\r |
243 | - Small memory requirements for decompressing (16 KB + DictionarySize)\r |
244 | - Small code size for decompressing: 5-8 KB\r |
245 | \r |
246 | LZMA decoder uses only integer operations and can be \r |
247 | implemented on any modern 32-bit CPU (or on a 16-bit CPU with some conditions).\r |
248 | \r |
249 | Some critical operations that affect the speed of LZMA decompression:\r |
250 | 1) 32*16 bit integer multiply\r |
251 | 2) Mispredicted branches (penalty mostly depends on pipeline length)\r |
252 | 3) 32-bit shift and arithmetic operations\r |
253 | \r |
254 | The speed of LZMA decompression depends mostly on CPU speed.\r |
255 | Memory speed matters much less. But if your CPU has a small data cache,\r |
256 | the overall weight of memory speed will slightly increase.\r |
257 | \r |
258 | \r |
259 | How To Use\r |
260 | ----------\r |
261 | \r |
262 | Using LZMA encoder/decoder executable\r |
263 | --------------------------------------\r |
264 | \r |
265 | Usage: LZMA <e|d> inputFile outputFile [<switches>...]\r |
266 | \r |
267 | e: encode file\r |
268 | \r |
269 | d: decode file\r |
270 | \r |
271 | b: Benchmark. There are two tests: compressing and decompressing \r |
272 | with LZMA method. Benchmark shows rating in MIPS (million \r |
273 | instructions per second). Rating value is calculated from \r |
274 | measured speed and it is normalized with Intel's Core 2 results.\r |
275 | Also Benchmark checks possible hardware errors (RAM \r |
276 | errors in most cases). Benchmark uses these settings:\r |
277 | (-a1, -d21, -fb32, -mfbt4). You can change only -d parameter. \r |
278 | Also you can change the number of iterations. Example for 30 iterations:\r |
279 | LZMA b 30\r |
280 | Default number of iterations is 10.\r |
281 | \r |
282 | <Switches>\r |
283 | \r |
284 | \r |
285 | -a{N}: set compression mode 0 = fast, 1 = normal\r |
286 | default: 1 (normal)\r |
287 | \r |
288 | -d{N}: set dictionary size - [0, 30], default: 23 (8MB)\r |
289 | The maximum value for dictionary size is 1 GB = 2^30 bytes.\r |
290 | Dictionary size is calculated as DictionarySize = 2^N bytes.\r |
291 | To decompress a file compressed by the LZMA method with dictionary\r |
292 | size D = 2^N, you need about D bytes of memory (RAM).\r |
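
The relation between the -d switch and memory use can be checked with a few lines of arithmetic. The following is a minimal sketch in Python, not part of the SDK; the function names are hypothetical, and the 16 KB figure is taken from the "LZMA features" list in this file.

```python
def dictionary_size(n: int) -> int:
    """Return the dictionary size in bytes for the switch value -d{N}."""
    if not 0 <= n <= 30:
        raise ValueError("N must be in [0, 30]")
    return 1 << n  # DictionarySize = 2^N bytes


def approx_decoder_memory(n: int) -> int:
    """Rough decoder RAM estimate: 16 KB of state plus the dictionary."""
    return 16 * 1024 + dictionary_size(n)


if __name__ == "__main__":
    for n in (16, 23, 30):
        print(n, dictionary_size(n), approx_decoder_memory(n))
```

For the default -d23 this gives an 8 MB dictionary, which matches the default stated above.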
293 | \r |
294 | -fb{N}: set number of fast bytes - [5, 273], default: 128\r |
295 | Usually, a larger number gives a slightly better compression ratio\r |
296 | and a slower compression process.\r |
297 | \r |
298 | -lc{N}: set number of literal context bits - [0, 8], default: 3\r |
299 | Sometimes lc=4 gives gain for big files.\r |
300 | \r |
301 | -lp{N}: set number of literal pos bits - [0, 4], default: 0\r |
302 | The lp switch is intended for periodic data whose period\r |
303 | equals 2^N. For example, for 32-bit (4 bytes)\r |
304 | periodic data you can use lp=2. Often it's better to set lc0\r |
305 | if you change the lp switch.\r |
306 | \r |
307 | -pb{N}: set number of pos bits - [0, 4], default: 2\r |
308 | The pb switch is intended for periodic data\r |
309 | whose period equals 2^N.\r |
310 | \r |
311 | -mf{MF_ID}: set Match Finder. Default: bt4. \r |
312 | Algorithms from the hc* group don't provide a good compression\r |
313 | ratio, but they often work pretty fast in combination with\r |
314 | fast mode (-a0).\r |
315 | \r |
316 | Memory requirements depend on dictionary size\r |
317 | (parameter "d" in the table below).\r |
318 | \r |
319 | MF_ID   Memory            Description\r |
320 | \r |
321 | bt2     d *  9.5 + 4MB    Binary Tree with 2 bytes hashing.\r |
322 | bt3     d * 11.5 + 4MB    Binary Tree with 3 bytes hashing.\r |
323 | bt4     d * 11.5 + 4MB    Binary Tree with 4 bytes hashing.\r |
324 | hc4     d *  7.5 + 4MB    Hash Chain with 4 bytes hashing.\r |
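
The table above can be turned into a quick estimate of encoder memory. This is a minimal sketch in Python, not SDK code; the names are hypothetical, and "4MB" is assumed here to mean 4 * 2^20 bytes.

```python
# Multipliers copied from the match finder table above.
MATCH_FINDER_FACTOR = {"bt2": 9.5, "bt3": 11.5, "bt4": 11.5, "hc4": 7.5}
FIXED_OVERHEAD = 4 * 1024 * 1024  # assumption: "4MB" means 4 * 2^20 bytes


def encoder_memory(mf_id: str, dict_bits: int) -> int:
    """Estimate encoder memory in bytes for a match finder and -d{N} value."""
    d = 1 << dict_bits  # dictionary size in bytes
    return int(d * MATCH_FINDER_FACTOR[mf_id] + FIXED_OVERHEAD)


if __name__ == "__main__":
    for mf in MATCH_FINDER_FACTOR:
        print(mf, encoder_memory(mf, 23) // (1024 * 1024), "MB at -d23")
```

With the default settings (bt4, -d23) the estimate is 96 MB under these assumptions.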
325 | \r |
326 | -eos: write End Of Stream marker. By default LZMA doesn't write \r |
327 | eos marker, since LZMA decoder knows uncompressed size \r |
328 | stored in .lzma file header.\r |
329 | \r |
330 | -si: Read data from stdin (it will write End Of Stream marker).\r |
331 | -so: Write data to stdout\r |
332 | \r |
333 | \r |
334 | Examples:\r |
335 | \r |
336 | 1) LZMA e file.bin file.lzma -d16 -lc0 \r |
337 | \r |
338 | compresses file.bin to file.lzma with a 64 KB dictionary (2^16=64K)\r |
339 | and 0 literal context bits. -lc0 reduces memory requirements\r |
340 | for decompression.\r |
341 | \r |
342 | \r |
343 | 2) LZMA e file.bin file.lzma -lc0 -lp2\r |
344 | \r |
345 | compresses file.bin to file.lzma with settings suitable\r |
346 | for 32-bit periodic data (for example, ARM or MIPS code).\r |
347 | \r |
348 | 3) LZMA d file.lzma file.bin\r |
349 | \r |
350 | decompresses file.lzma to file.bin.\r |
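
For readers without the lzma executable at hand, similar settings can be tried from Python. This is a rough analog only: Python's standard lzma module wraps liblzma from XZ Utils, not the LZMA SDK, and the mapping of -d, -lc, -lp and -pb to the filter keys below is an assumption made for illustration. The helper names are hypothetical.

```python
import lzma


def compress_alone(data: bytes, dict_bits: int = 23,
                   lc: int = 3, lp: int = 0, pb: int = 2) -> bytes:
    """Compress to the legacy .lzma container with explicit LZMA1 settings."""
    filters = [{
        "id": lzma.FILTER_LZMA1,
        "dict_size": 1 << dict_bits,  # analog of -d{N}
        "lc": lc,                     # analog of -lc{N}
        "lp": lp,                     # analog of -lp{N}
        "pb": pb,                     # analog of -pb{N}
    }]
    return lzma.compress(data, format=lzma.FORMAT_ALONE, filters=filters)


def decompress_alone(blob: bytes) -> bytes:
    """Decompress data stored in the legacy .lzma container."""
    return lzma.decompress(blob, format=lzma.FORMAT_ALONE)


if __name__ == "__main__":
    sample = bytes(range(256)) * 64
    packed = compress_alone(sample, dict_bits=16, lc=0, lp=2)
    print(len(sample), "->", len(packed))
    assert decompress_alone(packed) == sample
```

The call in the main block mirrors the spirit of examples 1 and 2 (small dictionary, lc0, lp2); the output is not claimed to be byte-identical to what the SDK's encoder produces.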
351 | \r |
352 | \r |
353 | Compression ratio hints\r |
354 | -----------------------\r |
355 | \r |
356 | Recommendations\r |
357 | ---------------\r |
358 | \r |
359 | To increase the compression ratio for LZMA compression, it's desirable\r |
360 | to have aligned data (if possible), and it's also desirable to arrange\r |
361 | data so that code is grouped in one place and data is\r |
362 | grouped in another place (this is better than mixing: code, data, code,\r |
363 | data, ...).\r |
364 | \r |
365 | \r |
366 | Filters\r |
367 | -------\r |
368 | You can increase the compression ratio for some data types by using\r |
369 | special filters before compressing. For example, it's possible to\r |
370 | increase the compression ratio by 5-10% for code for these CPU ISAs:\r |
371 | x86, IA-64, ARM, ARM-Thumb, PowerPC, SPARC.\r |
372 | \r |
373 | You can find the C source code of such filters in C/Bra*.* files.\r |
374 | \r |
375 | You can check the compression ratio gain of these filters with the following\r |
376 | 7-Zip commands (example for ARM code):\r |
377 | No filter:\r |
378 | 7z a a1.7z a.bin -m0=lzma\r |
379 | \r |
380 | With filter for little-endian ARM code:\r |
381 | 7z a a2.7z a.bin -m0=arm -m1=lzma \r |
382 | \r |
383 | It works as follows:\r |
384 | Compressing = Filter_encoding + LZMA_encoding\r |
385 | Decompressing = LZMA_decoding + Filter_decoding\r |
386 | \r |
387 | Compressing and decompressing speed of such filters is very high,\r |
388 | so it will not increase decompressing time too much.\r |
389 | Moreover, it reduces decompression time for LZMA_decoding, \r |
390 | since compression ratio with filtering is higher.\r |
391 | \r |
392 | These filters convert CALL (calling procedure) instructions \r |
393 | from relative offsets to absolute addresses, so such data becomes more \r |
394 | compressible.\r |
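
The idea of converting relative CALL offsets to absolute addresses can be illustrated with a toy filter. The sketch below is deliberately simplified and is NOT the algorithm used in C/Bra*.* (the real x86 filter applies additional checks before converting). It assumes every 0xE8 byte starts an x86 "CALL rel32" instruction, and all names are hypothetical.

```python
import struct


def _convert(data: bytes, encoding: bool) -> bytes:
    """Rewrite the 32-bit operand after each 0xE8 byte (toy CALL filter)."""
    buf = bytearray(data)
    i = 0
    while i + 5 <= len(buf):
        if buf[i] == 0xE8:  # assumed to be the x86 CALL rel32 opcode
            (value,) = struct.unpack_from("<I", buf, i + 1)
            next_ip = i + 5  # address of the instruction after the CALL
            if encoding:
                value = (value + next_ip) & 0xFFFFFFFF  # relative -> absolute
            else:
                value = (value - next_ip) & 0xFFFFFFFF  # absolute -> relative
            struct.pack_into("<I", buf, i + 1, value)
            i += 5  # skip the operand so both directions scan identically
        else:
            i += 1
    return bytes(buf)


def filter_encode(data: bytes) -> bytes:
    return _convert(data, encoding=True)


def filter_decode(data: bytes) -> bytes:
    return _convert(data, encoding=False)


if __name__ == "__main__":
    # Two CALLs to the same target (0x100) from different positions.
    code = (b"\xE8" + struct.pack("<I", 0x100 - 5) + b"\x90" * 3 +
            b"\xE8" + struct.pack("<I", 0x100 - 13))
    enc = filter_encode(code)
    print(enc[1:5] == enc[9:13])  # operands are now identical
    assert filter_decode(enc) == code
```

Before filtering, the two calls carry different relative offsets; after filtering, both carry the same absolute target, which gives the LZ stage a repeated byte sequence to match.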
395 | \r |
396 | For some ISAs (for example, for MIPS) such a filter provides no gain.\r |
397 | \r |
398 | \r |
399 | \r |
400 | ---\r |
401 | \r |
402 | http://www.7-zip.org\r |
403 | http://www.7-zip.org/sdk.html\r |
404 | http://www.7-zip.org/support.html\r |