# Fuzzing

Each fuzzing target can be built with multiple engines.
Zstd provides a fuzz corpus for each target that can be downloaded with
the command:

```
make corpora
```

It will download each corpus into `./corpora/TARGET`.

## fuzz.py

`fuzz.py` is a helper script for building and running fuzzers.
Run `./fuzz.py -h` for the commands and run `./fuzz.py COMMAND -h` for
command-specific help.

### Generating Data

`fuzz.py` provides a utility to generate seed data for each fuzzer.

```
make -C ../tests decodecorpus
./fuzz.py gen TARGET
```

By default it outputs 100 samples, each at most 8KB, into `corpora/TARGET-seed`,
but that can be configured with the `--number`, `--max-size-log` and `--seed`
flags.

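As a rough guide to choosing a value for `--max-size-log`, the sketch below assumes the flag is the base-2 logarithm of the maximum sample size in bytes. That reading matches the 8KB default (2^13) but is not stated above, so check `./fuzz.py gen -h` before relying on it. The helper name `max_sample_bytes` is hypothetical.

```shell
# Hypothetical helper: convert a --max-size-log value into a byte count.
# ASSUMPTION: the flag is the base-2 logarithm of the maximum sample size.
max_sample_bytes() {
    echo $((1 << $1))
}
```

Under that assumption, `max_sample_bytes 13` prints `8192`, which corresponds to the 8KB default.
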
### Build

`fuzz.py build` respects the usual build environment variables `CC`, `CFLAGS`, etc.
The environment variables can be overridden with the corresponding flags
`--cc`, `--cflags`, etc.
The specific fuzzing engine is selected with `LIB_FUZZING_ENGINE` or
`--lib-fuzzing-engine`; the default is `libregression.a`.
Alternatively, you can use Clang's built-in fuzzing engine with
`--enable-fuzzer`.
It has flags that set up sanitizers (`--enable-{a,ub,m}san`) and
coverage instrumentation (`--enable-coverage`).
It sets sane defaults, which can be overridden with flags such as `--debug` and
`--enable-ubsan-pointer-overflow`.
Run `./fuzz.py build -h` for help.

### Running Fuzzers

`./fuzz.py` can run `libfuzzer`, `afl`, and `regression` tests.
See the help of the relevant command for options.
Flags not parsed by `fuzz.py` are passed to the fuzzing engine.
The command used to run the fuzzer is printed for debugging.

Here's a helpful command to fuzz each target across multiple cores
(it uses 10 parallel jobs; adjust `-jobs` and `-workers` to match your core count),
stopping only if a bug is found:

```
for target in $(./fuzz.py list); do
    ./fuzz.py libfuzzer $target -jobs=10 -workers=10 -max_total_time=1000 || break;
done
```

Alternatively, you can fuzz all targets in parallel, using one core per target:

```
python3 ./fuzz.py list | xargs -P$(python3 ./fuzz.py list | wc -l) -I__ sh -c "python3 ./fuzz.py libfuzzer __ 2>&1 | tee __.log"
```

Either way, to double-check that no crashes were found, run `ls corpora/*crash`.
If any crashes were found, you can use the hashes to reproduce them.

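The crash check can also be scripted so that it returns a failing exit status, which is handy in automation. This is only a sketch: it assumes crash artifacts are regular files inside directories matching `corpora/*crash`, as the `ls` pattern suggests, and the function name `report_crashes` is hypothetical.

```shell
# Hypothetical helper: list crash artifacts under a corpora directory.
# ASSUMPTION: crashes are files inside directories matching <corpora>/*crash.
report_crashes() {
    corpora_dir="${1:-corpora}"
    found=0
    for dir in "$corpora_dir"/*crash; do
        # An unmatched glob stays literal, so skip anything that is not a directory.
        [ -d "$dir" ] || continue
        for file in "$dir"/*; do
            [ -f "$file" ] || continue
            echo "$file"
            found=1
        done
    done
    # Succeed only when no crash artifact was found.
    [ "$found" -eq 0 ]
}
```

Calling `report_crashes corpora` after a fuzzing run prints each crash file and exits with a non-zero status if any exist.
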
## LibFuzzer

```
# Build the fuzz targets
./fuzz.py build all --enable-fuzzer --enable-asan --enable-ubsan --cc clang --cxx clang++
# OR equivalently
CC=clang CXX=clang++ ./fuzz.py build all --enable-fuzzer --enable-asan --enable-ubsan
# Run the fuzzer
./fuzz.py libfuzzer TARGET <libfuzzer args like -jobs=4>
```

where `TARGET` could be `simple_decompress`, `stream_round_trip`, etc.

### MSAN

Fuzzing with `libFuzzer` and `MSAN` is as easy as:

```
CC=clang CXX=clang++ ./fuzz.py build all --enable-fuzzer --enable-msan
./fuzz.py libfuzzer TARGET <libfuzzer args>
```

`fuzz.py` respects the environment variables (and corresponding flags)
`MSAN_EXTRA_CPPFLAGS`, `MSAN_EXTRA_CFLAGS`, `MSAN_EXTRA_CXXFLAGS`, and
`MSAN_EXTRA_LDFLAGS`, which pass extra parameters to MSAN builds only.

## AFL

The default `LIB_FUZZING_ENGINE` is `libregression.a`, which produces a binary
that AFL can use.

```
# Build the fuzz targets
CC=afl-clang CXX=afl-clang++ ./fuzz.py build all --enable-asan --enable-ubsan
# Run the fuzzer without a memory limit because of ASAN
./fuzz.py afl TARGET -m none
```

## Regression Testing

The regression test supports the `all` target to run all the fuzzers in one
command.

```
CC=clang CXX=clang++ ./fuzz.py build all --enable-asan --enable-ubsan
./fuzz.py regression all
CC=clang CXX=clang++ ./fuzz.py build all --enable-msan
./fuzz.py regression all
```

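The two regression passes can be wrapped so that the sequence stops at the first failing step. This is a minimal sketch rather than part of the zstd tooling: `FUZZ_PY` and `DRY_RUN` are illustrative shell variables, not `fuzz.py` options, and the function names are hypothetical.

```shell
# Hypothetical wrapper around the regression commands shown above.
# FUZZ_PY and DRY_RUN are illustrative variables, not fuzz.py options.
FUZZ_PY="${FUZZ_PY:-./fuzz.py}"

# Print a command, then run it unless DRY_RUN is set to a non-empty value.
run() {
    echo "+ $*"
    [ -n "${DRY_RUN:-}" ] || "$@"
}

# Build all targets with the given sanitizer flags, then run every
# regression test; stop at the first failing step.
regression_pass() {
    run env CC=clang CXX=clang++ "$FUZZ_PY" build all "$@" &&
        run "$FUZZ_PY" regression all
}

# Run the ASAN+UBSAN pass first, then the MSAN pass.
regression_all() {
    regression_pass --enable-asan --enable-ubsan &&
        regression_pass --enable-msan
}
```

Running `regression_all` with `DRY_RUN=1` exported prints the four commands without executing them, which is a quick way to review the sequence before a long run.
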
## Fuzzing a custom sequence producer plugin
Sequence producer plugin authors can use the zstd fuzzers to stress-test their code.
See the documentation in `fuzz_third_party_seq_prod.h` for details.

## Adding a new fuzzer
There are several steps involved in adding a new fuzzer harness.

### Build your harness
1. Create your new fuzzer harness `tests/fuzz/your_harness.c`.

2. Add your harness to the Makefile.

   2.1 Follow [this example](https://github.com/facebook/zstd/blob/e124e39301381de8f323436a3e4c46539747ba24/tests/fuzz/Makefile#L216) if your fuzzer requires both compression and decompression symbols (prefix `rt_`). If your fuzzer only requires decompression symbols, follow [this example](https://github.com/facebook/zstd/blob/6a0052a409e2604bd40354b76b86272b712edd7d/tests/fuzz/Makefile#L194) (prefix `d_`).

   2.2 Add your target to [`FUZZ_TARGETS`](https://github.com/facebook/zstd/blob/6a0052a409e2604bd40354b76b86272b712edd7d/tests/fuzz/Makefile#L108).

3. Add your harness to [`fuzz.py`](https://github.com/facebook/zstd/blob/6a0052a409e2604bd40354b76b86272b712edd7d/tests/fuzz/fuzz.py#L48).

### Generate seed data
Follow the instructions above to generate seed data:
```
make -C ../tests decodecorpus
./fuzz.py gen your_harness
```

### Run the harness
Follow the instructions above to run your harness and fix any crashes:
```
./fuzz.py build your_harness --enable-fuzzer --enable-asan --enable-ubsan --cc clang --cxx clang++
./fuzz.py libfuzzer your_harness
```

### Minimize and zip the corpus
After running the fuzzer for a while, you will have a large corpus at `tests/fuzz/corpora/your_harness*`.
This corpus must be minimized and zipped before uploading to GitHub for regression testing:
```
./fuzz.py minimize your_harness
./fuzz.py zip your_harness
```

### Upload the zip file to GitHub
The previous step should produce a `.zip` file containing the corpus for your new harness.
This corpus must be uploaded to GitHub here: https://github.com/facebook/zstd/releases/tag/fuzz-corpora