On Fri, Oct 07, 2022 at 01:51:05PM +0100, Richard W.M. Jones wrote:
> Thinking about ways to expose even more code-paths, I wonder if we
> could tweak the client along the lines of:
>
>   if (rand () & 1)
>     nbd_set_handshake_flags (nbd, rand ());
>   if (rand () & 1)
>     nbd_set_strict_mode (nbd, rand ());
Adding randomization to the fuzzer is a bad idea I'm afraid,
specifically called out in the docs:
https://aflplus.plus/docs/faq/ (search for "Stability")
Interesting reading, including:
"There are functions that are unstable, but also provide value to
coverage, e.g., init functions that use fuzz data as input. If,
however, a function that has nothing to do with the input data is the
source of instability, e.g., checking jitter, or is a hash map
function etc., then it should not be instrumented."
So using rand() is probably going to hurt more than it helps (too
unpredictable; even if seeded, you can only fuzz the seed number, not
the pseudo-random sequence that follows from that seed). But setting
up a mode where we tweak our first few handshaking decisions based on
reading a few bytes from a fuzzed file may be worthwhile, since the
fuzzer can then explore changes it makes to that file as a way of
deterministically exploring different initialization paths.
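
As a rough sketch of that idea (not anything currently in the tree),
the two rand() calls could be replaced by a decoder over a few
fuzzer-supplied bytes. The struct, the function name and both mask
values below are hypothetical; real code would presumably use the
masks from <libnbd.h> and pass the results to
nbd_set_handshake_flags() and nbd_set_strict_mode().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical placeholder masks, assumed for illustration only. */
#define EXAMPLE_HANDSHAKE_MASK 0x3u
#define EXAMPLE_STRICT_MASK    0x1fu

/* Hypothetical container for the decisions decoded from fuzz bytes. */
struct fuzz_settings {
  int set_handshake;        /* non-zero: caller should set handshake flags */
  uint32_t handshake_flags;
  int set_strict;           /* non-zero: caller should set strict mode */
  uint32_t strict_flags;
};

/* Decode up to four bytes of fuzzer-controlled input.  The same bytes
 * always give the same settings, so a run stays reproducible.  Short
 * input simply leaves the corresponding setting at its default.
 */
static struct fuzz_settings
decode_fuzz_settings (const uint8_t *buf, size_t len)
{
  struct fuzz_settings s = { 0, 0, 0, 0 };

  if (len >= 2 && (buf[0] & 1)) {
    s.set_handshake = 1;
    s.handshake_flags = buf[1] & EXAMPLE_HANDSHAKE_MASK;
  }
  if (len >= 4 && (buf[2] & 1)) {
    s.set_strict = 1;
    s.strict_flags = buf[3] & EXAMPLE_STRICT_MASK;
  }
  return s;
}
```

Because the decisions come from bytes the fuzzer owns, a mutation of
one byte maps to one specific change in initialization, which is what
keeps the run stable.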
> and so forth, to allow the fuzzer to explore different combinations of
> settings.
The fuzzer will explore different paths by presenting different
inputs. In the case of libnbd, "input" means the network data that
normally libnbd would be reading from the NBD server. As long as
variations in those replies (inputs) can cause libnbd to take
different paths then the fuzzer will eventually explore those paths.
> Another idea might be:
>
> static void do_opt_structured_reply (void)
> { /* call nbd_opt_structured_reply() */ }
> static void do_opt_list_meta_context (void)
> { /* call nbd_opt_list_meta_context[_queries]() */ }
> ...
> void (*opts[])(void) = {
>   do_opt_structured_reply,
>   do_opt_list_meta_context,
>   ...
> };
>
> for (i = rand () % 20; i > 0; i--)
>   opts[i % ARRAY_SIZE (opts)] ();
>
> to play with different handshake sequences.
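This won't work for the same reason: rand() makes each run take a
different path for the same input, which hurts the fuzzer's stability.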
Okay, I see that better after looking at README; the fuzzing we are
attempting is based on a two-step process: first we generate an actual
capture of server replies to a valid client session, then we start
fuzzing on a replay of the client dealing with slight variations of
the server's reply (that is, trying to find spots where a
buggy/malicious server can trip up the client). Throwing in more
randomness to the initialization would let us create exponentially
more starting-point files in the first step of capturing actual client
sessions, but may not necessarily drive us any closer to the second
step of fuzzing the server's replies into tickling client bugs unless
we can tightly correlate which input sequence of the first step
determines which output file we should be fuzzing in the second step.
There may still be some fuzzing gains to be added, but it would be by
having yet another file under the fuzzer's control to read from during
initialization, and not by calls to rand().
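
A minimal sketch of what that could look like for the option-sequence
idea, assuming the extra file has already been read into a buffer.
The do_opt_* bodies here are stand-ins that only record which entry
ran (the real ones would call into libnbd), and call_log, MAX_STEPS
and run_opts_from_bytes are hypothetical names:

```c
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof (a) / sizeof ((a)[0]))
#define MAX_STEPS 20   /* same upper bound as the rand () % 20 loop */

/* Stand-in bookkeeping so the sketch runs without libnbd. */
static int call_log[MAX_STEPS];
static size_t nr_calls;

/* Stubs: real versions would issue the corresponding nbd_opt_* call. */
static void do_opt_structured_reply (void) { call_log[nr_calls++] = 0; }
static void do_opt_list_meta_context (void) { call_log[nr_calls++] = 1; }

static void (*opts[]) (void) = {
  do_opt_structured_reply,
  do_opt_list_meta_context,
};

/* Each input byte picks one table entry, in order.  Both the length
 * and the order of the sequence are fixed by the bytes, so replaying
 * the same file replays the same handshake.
 */
static void
run_opts_from_bytes (const uint8_t *buf, size_t len)
{
  size_t i;

  nr_calls = 0;
  for (i = 0; i < len && i < MAX_STEPS; i++)
    opts[buf[i] % ARRAY_SIZE (opts)] ();
}
```

This still leaves open the correlation problem above: each distinct
byte sequence implies a different server capture to replay against.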
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org