Decompressor Permissiveness to Invalid Data
===========================================

This document describes the behavior of the reference decompressor in cases
where it accepts formally invalid data instead of reporting an error.

While the reference decompressor *must* decode any compliant frame following
the specification, its ability to detect erroneous data is on a best-effort
basis: the decoder may accept input data that is formally invalid
when doing so poses no risk to the decoder and detecting it would cost too
much in complexity or speed.

In practice, the vast majority of invalid data is detected, if only because
many corruption events are dangerous for the decoder process (such as
requesting an out-of-bounds memory access) and many more are easy to check.

This document lists a few known cases where invalid data was formerly accepted
by the decoder, and what has changed since.


Offset == 0
-----------

**Last affected version**: v1.5.5

**Produced by the reference compressor**: No

**Example Frame**: `28b5 2ffd 0000 4500 0008 0002 002f 430b ae`

If a sequence is decoded with `literals_length = 0` and `offset_value = 3`
while `Repeated_Offset_1 = 1`, the computed offset will be `0`, which is
invalid.

The reference decompressor up to v1.5.5 processes this case as if the computed
offset were `1`, including inserting `1` into the repeated offset list.
This prevents the output buffer from remaining uninitialized, thus denying a
potential attack vector from an untrusted source.
However, in the rare case where this scenario results from a
transmission or storage error, the decoder relies on the checksum to detect
the error.

In newer versions, this case is always detected and reported as a corruption error.
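The arithmetic behind this case can be illustrated with a minimal Python sketch. It is not the reference decoder's code: `compute_offset`, `resolve_offset`, `CorruptionError`, and the `strict` flag are hypothetical names, the repeat-offset rules are assumed to be those of the format specification, and updates to the repeated offset list are left out. With `strict=False` the sketch mimics the behavior described for versions up to v1.5.5, and with `strict=True` the behavior of newer versions.

```python
class CorruptionError(ValueError):
    """Hypothetical stand-in for the decoder's corruption error."""


def compute_offset(offset_value, literals_length, rep):
    """Compute a sequence's offset from its decoded offset_value (assumed >= 1).

    rep is (Repeated_Offset_1, Repeated_Offset_2, Repeated_Offset_3).
    Updates to the repeated offset list are omitted to keep the sketch short.
    """
    if offset_value > 3:
        return offset_value - 3       # regular offset
    if literals_length != 0:
        return rep[offset_value - 1]  # 1, 2, 3 select repeated offsets 1, 2, 3
    if offset_value == 3:
        return rep[0] - 1             # Repeated_Offset_1 - 1, which can reach 0
    return rep[offset_value]          # shifted: 1, 2 select repeated offsets 2, 3


def resolve_offset(offset_value, literals_length, rep, strict=True):
    """Apply either the newer (strict) or the older (permissive) handling."""
    offset = compute_offset(offset_value, literals_length, rep)
    if offset == 0:
        if strict:
            raise CorruptionError("computed offset is 0")
        offset = 1  # handling described for versions up to v1.5.5
    return offset


print(compute_offset(3, 0, (1, 4, 8)))                # 0
print(resolve_offset(3, 0, (1, 4, 8), strict=False))  # 1
```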


Non-zero reserved bits
----------------------

**Last affected version**: v1.5.5

**Produced by the reference compressor**: No

The Sequences section of each block has a header, and one of its elements is a
byte that describes the compression mode of each symbol type.
This byte contains 2 reserved bits which must be set to zero.

The reference decompressor up to v1.5.5 simply ignores these 2 bits.
This behavior has no consequence for the rest of the frame decoding process.

In newer versions, the 2 reserved bits are actively checked for value zero,
and the decoder reports a corruption error if they are not.
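
The check on the reserved bits can be sketched in a few lines of Python. This is an illustration only, not the reference decoder's code: `parse_compression_modes` and its `strict` flag are hypothetical names, and the bit layout (three 2-bit mode fields in the upper six bits, reserved bits in the lowest two) is assumed from the format specification.

```python
def parse_compression_modes(byte, strict=True):
    """Split the compression modes byte into its three 2-bit mode fields.

    Returns (literals_lengths_mode, offsets_mode, match_lengths_mode).
    With strict=True, non-zero reserved bits raise an error, as in newer
    versions; with strict=False they are ignored, as in versions up to v1.5.5.
    """
    literals_lengths_mode = (byte >> 6) & 0x3
    offsets_mode = (byte >> 4) & 0x3
    match_lengths_mode = (byte >> 2) & 0x3
    reserved = byte & 0x3  # assumed to occupy the two lowest bits
    if strict and reserved != 0:
        raise ValueError("reserved bits must be zero")
    return literals_lengths_mode, offsets_mode, match_lengths_mode


print(parse_compression_modes(0b01101100))                # (1, 2, 3)
print(parse_compression_modes(0b00000001, strict=False))  # (0, 0, 0)
```

In both modes the three mode fields decode identically, which matches the observation that ignoring the reserved bits has no consequence for the rest of the decoding process.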