Decompressor Errata
===================

This document captures known decompressor bugs, where the decompressor rejects a valid zstd frame.
Each entry contains:
1. The last affected decompressor version.
2. The decompressor components affected.
3. Whether the compressed frame could ever be produced by the reference compressor.
4. An example frame (a hexadecimal string when it is short enough, a link to a golden file otherwise).
5. A description of the bug.

The document is in reverse chronological order, with the bugs that affect the most recent zstd decompressor versions listed first.


No sequence using the 2-bytes format
------------------------------------------------

**Last affected version**: v1.5.5

**Affected decompressor component(s)**: Library & CLI

**Produced by the reference compressor**: No

**Example Frame**: see zstd/tests/golden-decompression/zeroSeq_2B.zst

The zstd decoder incorrectly expected FSE tables when there are 0 sequences present in the block
and the value 0 is encoded using the 2-bytes format.
Instead, it should immediately end the sequence section and move on to the next block.

This situation was never generated by the reference compressor,
because representing 0 sequences with the 2-bytes format is inefficient
(the 1-byte format is always used in this case).
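
To make the two encodings of zero concrete, here is a minimal Python sketch of how `Number_of_Sequences` is read, following the 1-byte, 2-byte and 3-byte layouts of the zstd format specification. The function names are hypothetical and this is not the library's code. The bytes `80 00` are derived from the 2-byte formula and are not taken from the golden file.

```python
from typing import Tuple


def decode_number_of_sequences(data: bytes) -> Tuple[int, int]:
    """Decode Number_of_Sequences; return (count, header bytes consumed)."""
    if not data:
        raise ValueError("sequences section is empty")
    byte0 = data[0]
    if byte0 < 128:
        # 1-byte format: the value is the byte itself (0..127).
        return byte0, 1
    if byte0 < 255:
        # 2-byte format: ((byte0 - 128) << 8) + byte1.
        if len(data) < 2:
            raise ValueError("truncated 2-byte sequence count")
        return ((byte0 - 128) << 8) + data[1], 2
    # 3-byte format: 16-bit little-endian value plus a 0x7F00 bias.
    if len(data) < 3:
        raise ValueError("truncated 3-byte sequence count")
    return data[1] + (data[2] << 8) + 0x7F00, 3


def sequence_section_is_empty(data: bytes) -> bool:
    """True when the count is 0, so no FSE tables or bitstream follow."""
    count, _ = decode_number_of_sequences(data)
    return count == 0


print(decode_number_of_sequences(bytes([0x00])))        # (0, 1)
print(decode_number_of_sequences(bytes([0x80, 0x00])))  # (0, 2)
print(sequence_section_is_empty(bytes([0x80, 0x00])))   # True
```

Both inputs decode to a count of 0, so a conforming decoder ends the sequence section in either case, regardless of how many bytes the count occupied.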


Compressed block with a size of exactly 128 KB
------------------------------------------------

**Last affected version**: v1.5.2

**Affected decompressor component(s)**: Library & CLI

**Produced by the reference compressor**: No

**Example Frame**: see zstd/tests/golden-decompression/block-128k.zst

The zstd decoder incorrectly rejected blocks of type `Compressed_Block` when their size was exactly 128 KB.
Note that `128 KB - 1` was accepted, and `128 KB + 1` is forbidden by the spec.

This type of block was never generated by the reference compressor.

These blocks used to be disallowed by the spec up until spec version 0.3.2 when the restriction was lifted by [PR#1689](https://github.com/facebook/zstd/pull/1689).

> A Compressed_Block has the extra restriction that Block_Size is always strictly less than the decompressed size. If this condition cannot be respected, the block must be sent uncompressed instead (Raw_Block).
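
The boundary cases in this entry can be written out as a small Python sketch. It assumes a window of at least 128 KB, so that the block size limit is 128 KB. `buggy_accepts` only models the behaviour described above (a strict comparison is an assumption, not a quote of the zstd source), and all function names are hypothetical.

```python
BLOCK_SIZE_LIMIT = 128 * 1024  # 131072 bytes


def spec_accepts(block_size: int) -> bool:
    """Block_Size may be equal to the limit, but not larger."""
    return block_size <= BLOCK_SIZE_LIMIT


def buggy_accepts(block_size: int) -> bool:
    """Model of the affected decoders: the limit itself is rejected."""
    return block_size < BLOCK_SIZE_LIMIT


def encode_block_header(last_block: bool, block_type: int, block_size: int) -> bytes:
    """Pack Last_Block (1 bit), Block_Type (2 bits) and Block_Size (21 bits)
    into the 3-byte little-endian Block_Header."""
    value = int(last_block) | (block_type << 1) | (block_size << 3)
    return value.to_bytes(3, "little")


for size in (BLOCK_SIZE_LIMIT - 1, BLOCK_SIZE_LIMIT, BLOCK_SIZE_LIMIT + 1):
    print(size, spec_accepts(size), buggy_accepts(size))
# 131071 True True
# 131072 True False
# 131073 False False

# Header of a last Compressed_Block (type 2) of exactly 128 KB.
print(encode_block_header(True, 2, BLOCK_SIZE_LIMIT).hex())  # 050010
```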


Compressed block with 0 literals and 0 sequences
------------------------------------------------

**Last affected version**: v1.5.2

**Affected decompressor component(s)**: Library & CLI

**Produced by the reference compressor**: No

**Example Frame**: `28b5 2ffd 2000 1500 0000 00`

The zstd decoder incorrectly rejected blocks of type `Compressed_Block` that encode literals as `Raw_Literals_Block` with no literals and have no sequences.

This type of block was never generated by the reference compressor.

Additionally, these blocks were disallowed by the spec up until spec version 0.3.2 when the restriction was lifted by [PR#1689](https://github.com/facebook/zstd/pull/1689).

> A Compressed_Block has the extra restriction that Block_Size is always strictly less than the decompressed size. If this condition cannot be respected, the block must be sent uncompressed instead (Raw_Block).
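
The example frame for this entry is short enough to decode by hand. The following Python sketch walks its 11 bytes using the field layouts of the zstd format specification. It is deliberately not a general parser: it only handles the frame header descriptor `0x20` and a 1-byte raw literals header, and `describe_example` is a hypothetical helper name.

```python
BLOCK_TYPES = ("Raw_Block", "RLE_Block", "Compressed_Block", "Reserved")


def describe_example(frame: bytes) -> dict:
    """Decode the fields of the example frame (not a general parser)."""
    if frame[:4] != bytes.fromhex("28b52ffd"):
        raise ValueError("not a standard-format zstd frame")
    # Descriptor 0x20 sets only Single_Segment_flag, so the header is
    # followed by a 1-byte Frame_Content_Size and nothing else.
    if frame[4] != 0x20:
        raise ValueError("this sketch only handles descriptor 0x20")
    content_size = frame[5]

    # Block_Header: 3 bytes, little-endian.
    header = int.from_bytes(frame[6:9], "little")
    block_size = header >> 3
    block = frame[9:9 + block_size]

    # Literals header: type in bits 0-1; for raw literals with bit 2
    # clear, the header is 1 byte and Regenerated_Size is bits 3-7.
    literals_header = block[0]
    if literals_header & 0b111 != 0:
        raise ValueError("expected a 1-byte Raw_Literals_Block header")
    literals_size = literals_header >> 3

    # The sequences section starts right after the raw literals.
    sequences_byte = block[1 + literals_size]
    if sequences_byte >= 128:
        raise ValueError("this sketch only handles the 1-byte count")

    return {
        "frame_content_size": content_size,
        "last_block": bool(header & 1),
        "block_type": BLOCK_TYPES[(header >> 1) & 3],
        "block_size": block_size,
        "literals_size": literals_size,
        "number_of_sequences": sequences_byte,
    }


info = describe_example(bytes.fromhex("28b5 2ffd 2000 1500 0000 00"))
print(info["block_type"], info["block_size"])                # Compressed_Block 2
print(info["literals_size"], info["number_of_sequences"])    # 0 0
```

The block is 2 bytes long and regenerates 0 bytes, which is the case the quoted restriction used to forbid.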


First block is RLE block
------------------------

**Last affected version**: v1.4.3

**Affected decompressor component(s)**: CLI only

**Produced by the reference compressor**: No

**Example Frame**: `28b5 2ffd a001 0002 0002 0010 000b 0000 00`

The zstd CLI decompressor rejected cases where the first block was an RLE block whose `Block_Size` is 131072, and the frame contains more than one block.
This only affected the zstd CLI, and not the library.

The example is an RLE block with 131072 bytes, followed by a second RLE block with 1 byte.

The compressor currently works around this limitation by explicitly avoiding producing RLE blocks as the first
block.

https://github.com/facebook/zstd/blob/8814aa5bfa74f05a86e55e9d508da177a893ceeb/lib/compress/zstd_compress.c#L3527-L3535
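
The structure of the example frame for this entry can be checked by walking its block headers. The Python sketch below follows the frame and block header layouts of the zstd format specification, with a hypothetical helper name `walk_blocks`. It assumes a well-formed frame and does not decompress anything.

```python
BLOCK_TYPES = ("Raw_Block", "RLE_Block", "Compressed_Block", "Reserved")


def walk_blocks(frame: bytes):
    """Return (Frame_Content_Size or None, [(type, Block_Size, last), ...])."""
    if frame[:4] != bytes.fromhex("28b52ffd"):
        raise ValueError("not a standard-format zstd frame")
    descriptor = frame[4]
    fcs_flag = descriptor >> 6
    single_segment = (descriptor >> 5) & 1
    dict_id_size = (0, 1, 2, 4)[descriptor & 3]
    window_size = 0 if single_segment else 1
    fcs_size = (1 if single_segment else 0, 2, 4, 8)[fcs_flag]

    pos = 5 + window_size + dict_id_size
    content_size = None
    if fcs_size:
        content_size = int.from_bytes(frame[pos:pos + fcs_size], "little")
        if fcs_size == 2:
            content_size += 256  # the 2-byte field is stored with an offset
    pos += fcs_size

    blocks = []
    while True:
        header = int.from_bytes(frame[pos:pos + 3], "little")
        last, block_type, block_size = header & 1, (header >> 1) & 3, header >> 3
        blocks.append((BLOCK_TYPES[block_type], block_size, bool(last)))
        # An RLE block stores a single byte; Block_Size is its repeat count.
        pos += 3 + (1 if block_type == 1 else block_size)
        if last:
            return content_size, blocks


size, blocks = walk_blocks(
    bytes.fromhex("28b5 2ffd a001 0002 0002 0010 000b 0000 00"))
print(size)    # 131073
print(blocks)  # [('RLE_Block', 131072, False), ('RLE_Block', 1, True)]
```

The declared `Frame_Content_Size` of 131073 matches the sum of the two RLE blocks, which agrees with the description of the example.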


Tiny FSE Table & Block
----------------------

**Last affected version**: v1.3.4

**Affected decompressor component(s)**: Library & CLI

**Produced by the reference compressor**: Possibly until version v1.3.4, but probably never

**Example Frame**: `28b5 2ffd 2027 c500 0080 f3f1 f0ec ebc6 c5c7 f09d 4300 0000 e0e0 0658 0100 603e 52`

The zstd library rejected blocks of type `Compressed_Block` whose offset of the last table with type `FSE_Compressed_Mode` was less than 4 bytes from the end of the block.

In more depth, let `Last_Table_Offset` be the offset in the compressed block (excluding the header) at which
the last table with type `FSE_Compressed_Mode` starts. If `Block_Content - Last_Table_Offset < 4` then
the buggy zstd decompressor would reject the block. This occurs when the last serialized table is 2 bytes
and the bitstream size is 1 byte.

For example:
* There is 1 sequence in the block
* `Literals_Lengths_Mode` is `FSE_Compressed_Mode` & the serialized table size is 2 bytes
* `Offsets_Mode` is `Predefined_Mode`
* `Match_Lengths_Mode` is `Predefined_Mode`
* The bitstream is 1 byte. E.g. there is only one sequence and it fits in 1 byte.

The total `Block_Content` is `5` bytes, and `Last_Table_Offset` is `2`.

See the compressor workaround code:

https://github.com/facebook/zstd/blob/8814aa5bfa74f05a86e55e9d508da177a893ceeb/lib/compress/zstd_compress.c#L2667-L2682
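
The rejection condition stated in this entry reduces to one comparison. As a minimal Python sketch (`buggy_decoder_rejects` is a hypothetical name that models the described behaviour, not the decoder's actual code), using the numbers from the example:

```python
MIN_BYTES_AFTER_TABLE_START = 4  # threshold from the description above


def buggy_decoder_rejects(block_content_size: int, last_table_offset: int) -> bool:
    """True when fewer than 4 bytes separate the start of the last
    FSE-compressed table from the end of the block content."""
    return block_content_size - last_table_offset < MIN_BYTES_AFTER_TABLE_START


# Example from this entry: Block_Content is 5 bytes, Last_Table_Offset is 2.
print(buggy_decoder_rejects(5, 2))  # True, since 5 - 2 = 3 < 4
print(buggy_decoder_rejects(6, 2))  # False, since 6 - 2 = 4
```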

Magicless format
----------------------

**Last affected version**: v1.5.5

**Affected decompressor component(s)**: Library

**Produced by the reference compressor**: Yes (example: https://gist.github.com/embg/9940726094f4cf2cef162cffe9319232)

**Example Frame**: `27 b5 2f fd 00 03 19 00 00 66 6f 6f 3f ba c4 59`

v1.5.6 fixes several bugs in which the magicless-format decoder rejects valid frames.
These include but are not limited to:
* Valid frames that happen to begin with a legacy magic number (little-endian)
* Valid frames that happen to begin with a skippable magic number (little-endian)

If you are affected by this issue and cannot update to v1.5.6 or later, there is a
workaround to recover affected data. Simply prepend the ZSTD magic number
`0xFD2FB528` (little-endian) to your data and decompress using the standard-format
decoder.
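
The workaround can be sketched in Python as follows. The first helper only prepends the magic number; the result is what you would hand to a standard-format zstd decoder of your choice. The second helper is a hypothetical sanity check that reads the frame header and returns the first block's bytes when it is a `Raw_Block`, which happens to be the case for the example frame. Neither helper verifies the checksum or decompresses anything.

```python
ZSTD_MAGIC = (0xFD2FB528).to_bytes(4, "little")  # b"\x28\xb5\x2f\xfd"


def to_standard_format(magicless_frame: bytes) -> bytes:
    """Prepend the zstd magic number to a magicless frame."""
    return ZSTD_MAGIC + magicless_frame


def first_raw_block(frame: bytes) -> bytes:
    """Return the content of the first block, assuming it is a Raw_Block."""
    if frame[:4] != ZSTD_MAGIC:
        raise ValueError("missing zstd magic number")
    descriptor = frame[4]
    single_segment = (descriptor >> 5) & 1
    window_size = 0 if single_segment else 1
    dict_id_size = (0, 1, 2, 4)[descriptor & 3]
    fcs_size = (1 if single_segment else 0, 2, 4, 8)[descriptor >> 6]
    pos = 5 + window_size + dict_id_size + fcs_size
    header = int.from_bytes(frame[pos:pos + 3], "little")
    if (header >> 1) & 3 != 0:
        raise ValueError("this sketch only handles Raw_Block")
    block_size = header >> 3
    return frame[pos + 3:pos + 3 + block_size]


magicless = bytes.fromhex("27 b5 2f fd 00 03 19 00 00 66 6f 6f 3f ba c4 59")
standard = to_standard_format(magicless)
print(standard[:8].hex())         # 28b52ffd27b52ffd
print(first_raw_block(standard))  # b'foo'
```

Note that the magicless example begins with `27 b5 2f fd`: the byte `0x27` is its frame header descriptor, and together with the following dictionary ID bytes it resembles a legacy magic number, which is the first case listed above.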