LZMA SDK 16.04
--------------

LZMA SDK provides the documentation, samples, header files,
libraries, and tools you need to develop applications that
use 7z / LZMA / LZMA2 / XZ compression.

LZMA is an improved version of the well-known LZ77 compression algorithm.
It was designed to maximize the compression ratio while keeping
decompression fast and the memory requirements for decompression low.

LZMA2 is an LZMA-based compression method. LZMA2 provides better
multithreading support for compression than LZMA, plus some other improvements.

7z is a file format for data compression and file archiving.
7z is the main file format of the 7-Zip compression program (www.7-zip.org).
The 7z format supports different compression methods: LZMA, LZMA2 and others.
7z also supports AES-256 based encryption.

XZ is a file format for data compression that uses LZMA2 compression.
The XZ format provides additional features: SHA/CRC check, filters for
improved compression ratio, and splitting into blocks and streams.
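
The XZ features listed above (integrity check, multiple streams) can be
tried without building the SDK. The sketch below is only an illustration
and assumes Python's standard lzma module, which is built on liblzma and
is not part of LZMA SDK. CRC32 is used because it is the check most likely
to be available; other check types such as SHA-256 depend on how the
underlying library was built.

```python
import lzma

data = b"LZMA SDK example payload. " * 1000

# Two independent .xz streams, each carrying a CRC32 integrity check.
stream_a = lzma.compress(data, format=lzma.FORMAT_XZ, check=lzma.CHECK_CRC32)
stream_b = lzma.compress(b"second stream", format=lzma.FORMAT_XZ,
                         check=lzma.CHECK_CRC32)

# Decode the first stream and ask which integrity check it declared.
decoder = lzma.LZMADecompressor(format=lzma.FORMAT_XZ)
restored = decoder.decompress(stream_a)
check_used = decoder.check

# Concatenated streams are decoded one after another into a single result.
joined = lzma.decompress(stream_a + stream_b)
```

A corrupted stream fails the check during decoding, so the check also acts
as a cheap test that the data arrived intact.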


LICENSE
-------

LZMA SDK is written and placed in the public domain by Igor Pavlov.

Some code in LZMA SDK is based on public domain code from other developers:
  1) PPMd var.H (2001): Dmitry Shkarin
  2) SHA-256: Wei Dai (Crypto++ library)

Anyone is free to copy, modify, publish, use, compile, sell, or distribute the
original LZMA SDK code, either in source code form or as a compiled binary, for
any purpose, commercial or non-commercial, and by any means.

LZMA SDK code is compatible with open source licenses. For example, you can
include it in GNU GPL or GNU LGPL code.


LZMA SDK Contents
-----------------

Source code:

  - C / C++ / C# / Java - LZMA compression and decompression
  - C / C++ - LZMA2 compression and decompression
  - C / C++ - XZ compression and decompression
  - C - 7z decompression
  - C++ - 7z compression and decompression
  - C - small SFXs for installers (7z decompression)
  - C++ - SFXs and SFXs for installers (7z decompression)

Precompiled binaries:

  - console programs for lzma / 7z / xz compression and decompression
  - SFX modules for installers.


UNIX/Linux version
------------------
To compile the C++ version of file->file LZMA encoding, go to the directory
   CPP/7zip/Bundles/LzmaCon
and run make to rebuild it:
   make -f makefile.gcc clean all

On some UNIX/Linux versions you must compile LZMA with static libraries.
To compile with static libraries, you can use
   LIB = -lm -static

You can also use p7zip (a port of 7-Zip for POSIX systems such as Unix or Linux):

   http://p7zip.sourceforge.net/


Files
-----

DOC/7zC.txt                - 7z ANSI-C Decoder description
DOC/7zFormat.txt           - 7z Format description
DOC/installer.txt          - information about 7-Zip for installers
DOC/lzma.txt               - LZMA compression description
DOC/lzma-sdk.txt           - LZMA SDK description (this file)
DOC/lzma-history.txt       - history of LZMA SDK
DOC/lzma-specification.txt - Specification of LZMA
DOC/Methods.txt            - Compression method IDs for .7z

bin/installer/   - example script to create installer that uses SFX module

bin/7zdec.exe    - simplified 7z archive decoder
bin/7zr.exe      - 7-Zip console program (reduced version)
bin/x64/7zr.exe  - 7-Zip console program (reduced version) (x64 version)
bin/lzma.exe     - file->file LZMA encoder/decoder for Windows
bin/7zS2.sfx     - small SFX module for installers (GUI version)
bin/7zS2con.sfx  - small SFX module for installers (Console version)
bin/7zSD.sfx     - SFX module for installers.


7zDec.exe
---------
7zDec.exe is a simplified 7z archive decoder.
It supports only the LZMA, LZMA2, and PPMd methods.
7zDec decodes a whole solid block from the 7z archive to RAM,
so RAM consumption can be high.


Source code structure
---------------------


Asm/ - asm files (optimized code for CRC calculation and Intel-AES encryption)

C/  - C files (compression / decompression and other)
  Util/
    7z       - 7z decoder program (decoding 7z files)
    Lzma     - LZMA program (file->file LZMA encoder/decoder).
    LzmaLib  - LZMA library (.DLL for Windows)
    SfxSetup - small SFX module for installers

CPP/ - C++ files

  Common  - common files for C++ projects
  Windows - common files for Windows related code

  7zip    - files related to 7-Zip

    Archive - files related to archiving

      Common   - common files for archive handling
      7z       - 7z C++ Encoder/Decoder

    Bundles  - Modules that are bundles of other modules (files)

      Alone7z          - 7zr.exe: Standalone 7-Zip console program (reduced version)
      Format7zExtractR - 7zxr.dll: Reduced version of 7z DLL: extracting from 7z/LZMA/BCJ/BCJ2.
      Format7zR        - 7zr.dll: Reduced version of 7z DLL: extracting/compressing to 7z/LZMA/BCJ/BCJ2
      LzmaCon          - lzma.exe: LZMA compression/decompression
      LzmaSpec         - example code for LZMA Specification
      SFXCon           - 7zCon.sfx: Console 7z SFX module
      SFXSetup         - 7zS.sfx: 7z SFX module for installers
      SFXWin           - 7z.sfx: GUI 7z SFX module

    Common   - common files for 7-Zip

    Compress - files for compression/decompression

    Crypto   - files for encryption / decryption

    UI       - User Interface files

      Client7z    - Test application for 7za.dll, 7zr.dll, 7zxr.dll
      Common      - Common UI files
      Console     - Code for console program (7z.exe)
      Explorer    - Some code from 7-Zip Shell extension
      FileManager - Some GUI code from 7-Zip File Manager
      GUI         - Some GUI code from 7-Zip


CS/ - C# files
  7zip
    Common   - some common files for 7-Zip
    Compress - files related to compression/decompression
      LZ         - files related to LZ (Lempel-Ziv) compression algorithm
      LZMA       - LZMA compression/decompression
      LzmaAlone  - file->file LZMA compression/decompression
      RangeCoder - Range Coder (special code of compression/decompression)

Java/ - Java files
  SevenZip
    Compression - files related to compression/decompression
      LZ         - files related to LZ (Lempel-Ziv) compression algorithm
      LZMA       - LZMA compression/decompression
      RangeCoder - Range Coder (special code of compression/decompression)


Note:
  Asm / C / C++ source code of LZMA SDK is part of 7-Zip's source code.
  7-Zip's source code can be downloaded from 7-Zip's SourceForge page:

  http://sourceforge.net/projects/sevenzip/



LZMA features
-------------
  - Variable dictionary size (up to 1 GB)
  - Estimated compressing speed: about 2 MB/s on 2 GHz CPU
  - Estimated decompressing speed:
      - 20-30 MB/s on modern 2 GHz CPU
      - 1-2 MB/s on 200 MHz simple RISC CPU (ARM, MIPS, PowerPC)
  - Small memory requirements for decompressing (16 KB + DictionarySize)
  - Small code size for decompressing: 5-8 KB
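
The decompression memory figure in the list above (16 KB + DictionarySize)
is easy to evaluate for a given dictionary. The helper below is a minimal
sketch of that arithmetic only; the function name is hypothetical, the
16 KB term is taken as the fixed estimate quoted above, and the dictionary
is assumed to be a power of two of at most 1 GB.

```python
KB = 1024

def decoder_memory_bytes(dict_bits, state_bytes=16 * KB):
    """Estimate decoder RAM: fixed state plus a dictionary of 2^dict_bits bytes."""
    if not 0 <= dict_bits <= 30:
        raise ValueError("dictionary exponent must be between 0 and 30")
    return state_bytes + (1 << dict_bits)

# Print the estimate for a 64 KB, an 8 MB and a 1 GB dictionary.
for bits in (16, 23, 30):
    print(f"2^{bits} dictionary: about {decoder_memory_bytes(bits) // KB} KB")
```

For small embedded targets the dictionary term dominates quickly, which is
why choosing a small dictionary at compression time matters there.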

The LZMA decoder uses only integer operations and can be
implemented on any modern 32-bit CPU (or on a 16-bit CPU with some conditions).

Some critical operations that affect the speed of LZMA decompression:
  1) 32*16 bit integer multiply
  2) Mispredicted branches (the penalty mostly depends on pipeline length)
  3) 32-bit shift and arithmetic operations

The speed of LZMA decompression mostly depends on CPU speed.
Memory speed matters little. But if your CPU has a small data cache,
memory speed will have slightly more influence on the overall speed.


How To Use
----------

Using LZMA encoder/decoder executable
--------------------------------------

Usage:  LZMA <e|d> inputFile outputFile [<switches>...]

  e: encode file

  d: decode file

  b: Benchmark. There are two tests: compressing and decompressing
     with the LZMA method. The benchmark shows a rating in MIPS (million
     instructions per second). The rating value is calculated from the
     measured speed and is normalized with Intel's Core 2 results.
     The benchmark also checks for possible hardware errors (RAM
     errors in most cases). The benchmark uses these settings:
     (-a1, -d21, -fb32, -mfbt4). You can change only the -d parameter.
     You can also change the number of iterations. Example for 30 iterations:
       LZMA b 30
     The default number of iterations is 10.

<Switches>


  -a{N}:  set compression mode 0 = fast, 1 = normal
          default: 1 (normal)

  -d{N}:  set dictionary size - [0, 30], default: 23 (8MB)
          The maximum value for dictionary size is 1 GB = 2^30 bytes.
          Dictionary size is calculated as DictionarySize = 2^N bytes.
          To decompress a file compressed by the LZMA method with dictionary
          size D = 2^N, you need about D bytes of memory (RAM).

  -fb{N}: set number of fast bytes - [5, 273], default: 128
          Usually a bigger number gives a slightly better compression ratio
          and a slower compression process.

  -lc{N}: set number of literal context bits - [0, 8], default: 3
          Sometimes lc=4 gives a gain for big files.

  -lp{N}: set number of literal pos bits - [0, 4], default: 0
          The lp switch is intended for periodical data when the period is
          equal to 2^N. For example, for 32-bit (4 bytes)
          periodical data you can use lp=2. Often it's better to set lc0
          if you change the lp switch.

  -pb{N}: set number of pos bits - [0, 4], default: 2
          The pb switch is intended for periodical data
          when the period is equal to 2^N.

  -mf{MF_ID}: set Match Finder. Default: bt4.
              Algorithms from the hc* group don't provide a good compression
              ratio, but they often work pretty fast in combination with
              fast mode (-a0).

              Memory requirements depend on dictionary size
              (parameter "d" in the table below).

               MF_ID     Memory          Description

                bt2    d *  9.5 + 4MB  Binary Tree with 2 bytes hashing.
                bt3    d * 11.5 + 4MB  Binary Tree with 3 bytes hashing.
                bt4    d * 11.5 + 4MB  Binary Tree with 4 bytes hashing.
                hc4    d *  7.5 + 4MB  Hash Chain with 4 bytes hashing.

  -eos:   write End Of Stream marker. By default LZMA doesn't write
          the eos marker, since the LZMA decoder knows the uncompressed size
          stored in the .lzma file header.

  -si:    Read data from stdin (it will write End Of Stream marker).
  -so:    Write data to stdout
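
The match finder table above gives the encoder's memory use as a function
of the dictionary size d. A short sketch of that arithmetic follows; the
function and dictionary names are hypothetical, and the multipliers are
copied from the table rather than measured.

```python
MB = 1024 * 1024

# Multipliers from the MF_ID table: memory = d * factor + 4 MB.
MATCH_FINDER_FACTOR = {"bt2": 9.5, "bt3": 11.5, "bt4": 11.5, "hc4": 7.5}

def encoder_memory_bytes(dict_bits, match_finder="bt4"):
    """Estimate encoder RAM for a dictionary of 2^dict_bits bytes (the -d{N} value)."""
    if not 0 <= dict_bits <= 30:
        raise ValueError("dictionary exponent must be between 0 and 30")
    d = 1 << dict_bits
    return int(d * MATCH_FINDER_FACTOR[match_finder] + 4 * MB)

# Default settings (-d23 -mfbt4) versus a hash-chain finder with the same dictionary.
default_mb = encoder_memory_bytes(23, "bt4") / MB
hash_chain_mb = encoder_memory_bytes(23, "hc4") / MB
print(f"bt4: {default_mb:.0f} MB, hc4: {hash_chain_mb:.0f} MB")
```

Compression therefore needs roughly an order of magnitude more memory
than decompression with the same dictionary.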


Examples:

1) LZMA e file.bin file.lzma -d16 -lc0

compresses file.bin to file.lzma with a 64 KB dictionary (2^16=64K)
and 0 literal context bits. -lc0 reduces the memory requirements
for decompression.


2) LZMA e file.bin file.lzma -lc0 -lp2

compresses file.bin to file.lzma with settings suitable
for 32-bit periodical data (for example, ARM or MIPS code).

3) LZMA d file.lzma file.bin

decompresses file.lzma to file.bin.
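
The command-line examples above have rough counterparts in other LZMA
implementations. The sketch below assumes Python's standard lzma module
(built on liblzma, not on this SDK), whose FORMAT_ALONE corresponds to the
.lzma file format. The helper names are hypothetical. The header parser
relies on the 13-byte .lzma header: one properties byte equal to
(pb * 5 + lp) * 9 + lc, a 4-byte little-endian dictionary size, and an
8-byte little-endian uncompressed size.

```python
import lzma
import struct

def lzma_alone_filter(dict_bits=23, lc=3, lp=0, pb=2):
    """Build a one-filter chain mirroring the -d, -lc, -lp and -pb switches."""
    return [{
        "id": lzma.FILTER_LZMA1,
        "dict_size": 1 << dict_bits,
        "lc": lc,
        "lp": lp,
        "pb": pb,
    }]

def parse_lzma_header(blob):
    """Return (lc, lp, pb, dict_size, uncompressed_size) from a .lzma header."""
    props, dict_size, size = struct.unpack("<BIQ", blob[:13])
    return props % 9, (props // 9) % 5, props // 45, dict_size, size

data = bytes(range(256)) * 64

# Counterpart of: LZMA e file.bin file.lzma -d16 -lc0
packed = lzma.compress(data, format=lzma.FORMAT_ALONE,
                       filters=lzma_alone_filter(dict_bits=16, lc=0))
lc, lp, pb, dict_size, stored_size = parse_lzma_header(packed)

# Counterpart of: LZMA d file.lzma file.bin
unpacked = lzma.decompress(packed, format=lzma.FORMAT_ALONE)
```

The stored size field may differ between encoders: an encoder that does
not know the input length in advance marks the size as unknown and relies
on the End Of Stream marker instead, as described for the -eos and -si
switches.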


Compression ratio hints
-----------------------

Recommendations
---------------

To increase the compression ratio for LZMA compression, it's desirable
to have aligned data (if possible), and also to arrange the data
so that code is grouped in one place and data is grouped in another
(this is better than mixing them: code, data, code, data, ...).


Filters
-------
You can increase the compression ratio for some data types by using
special filters before compressing. For example, it's possible to
increase the compression ratio by 5-10% for code for these CPU ISAs:
x86, IA-64, ARM, ARM-Thumb, PowerPC, SPARC.

You can find the C source code of such filters in the C/Bra*.* files.

You can check the compression ratio gain of these filters with the following
7-Zip commands (example for ARM code):
No filter:
  7z a a1.7z a.bin -m0=lzma

With filter for little-endian ARM code:
  7z a a2.7z a.bin -m0=arm -m1=lzma

It works as follows:
Compressing = Filter_encoding + LZMA_encoding
Decompressing = LZMA_decoding + Filter_decoding

The compressing and decompressing speed of such filters is very high,
so they do not increase decompressing time much.
Moreover, filtering reduces the time spent in LZMA_decoding,
since the compression ratio with filtering is higher.

These filters convert CALL (calling procedure) instructions
from relative offsets to absolute addresses, so such data becomes more
compressible.

For some ISAs (for example, MIPS) it's impossible to get a gain from such a filter.
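
The effect of converting relative call offsets to absolute addresses can
be observed on synthetic input. The sketch below is an illustration under
several assumptions: it uses Python's standard lzma module (liblzma) and
its ARM filter instead of the SDK's C/Bra*.* code, an XZ container with
LZMA2 instead of 7z with LZMA, and artificial "code" in which every call
targets the same address. The function name is hypothetical, and real
binaries will show a much smaller gain than this constructed case.

```python
import lzma
import struct

def synthetic_arm_code(calls=4096, target=0x100000):
    """Build fake little-endian ARM code: a BL to one shared target, then a NOP."""
    out = bytearray()
    for i in range(calls):
        pos = i * 8
        # BL stores (target - (pos + 8)) / 4 as a 24-bit offset below opcode 0xEB,
        # so the same target yields a different encoding at every call site.
        rel = ((target - (pos + 8)) >> 2) & 0xFFFFFF
        out += struct.pack("<I", 0xEB000000 | rel)
        out += struct.pack("<I", 0xE1A00000)  # mov r0, r0 (used as a NOP)
    return bytes(out)

code = synthetic_arm_code()
lzma2 = {"id": lzma.FILTER_LZMA2, "preset": 6}

# Compressing = LZMA_encoding only.
plain = lzma.compress(code, format=lzma.FORMAT_XZ, filters=[lzma2])

# Compressing = Filter_encoding + LZMA_encoding.
filtered = lzma.compress(code, format=lzma.FORMAT_XZ,
                         filters=[{"id": lzma.FILTER_ARM}, lzma2])

# Decompressing undoes both steps; the filter chain is read from the stream.
roundtrip = lzma.decompress(filtered)
print(len(code), len(plain), len(filtered))
```

After filtering, every call instruction in this input carries the same
absolute address, so the encoder sees one repeating pattern instead of a
slowly changing sequence of offsets.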



---

355 | http://www.7-zip.org\r |
356 | http://www.7-zip.org/sdk.html\r |
357 | http://www.7-zip.org/support.html\r |