On Sat, May 23, 2020 at 08:06:54AM +0100, Richard W.M. Jones via Libc-alpha wrote:
The context to this is that nbdkit uses sscanf to parse simple file
formats in various places, eg:
https://github.com/libguestfs/nbdkit/blob/b23f4f53cf71326f1dba481f64f7f18...
https://github.com/libguestfs/nbdkit/blob/b23f4f53cf71326f1dba481f64f7f18...
We can only do this safely where we can prove that overflow does not
matter.
Being that it's specified as UB, it can never "not matter";
arbitrarily bad side effects are permitted. So it's only safe to use
scanf where the input is *trusted not to contain overflowing values*.
What would be really nice to fix here is getting the standard to
specify that overflow has behavior like strto* or at least
"unspecified value" rather than "undefined behavior" so that it's
safe
to let it overflow in cases where you don't care (e.g. you'll be
consistency-checking the value afterwards anyway).
In other cases we've had to change sscanf uses to strto* etc
which is much more difficult to use correctly. Just look at how much
code is required to wrap strto* functions to use them safely:
https://github.com/libguestfs/nbdkit/blob/b23f4f53cf71326f1dba481f64f7f18...
Really that much code is just for the sake of verbose error messages,
and they're not even accurate. "errno!=0" does not mean "could not
parse"; it can also be overflow of a perfectly parseable value. And if
you've already caught errno!=0 then end==str is impossible (dead
code). The last case, not hitting null, is also likely spurious/wrong;
you usually *want* to pick up where strto* stopped, and the next thing
the parser does will catch whether the characters after the number are
valid there or not.
strto* do have some annoying design flaws in error reporting, but
they're not really that hard to use right, and much easier than scanf
which just *lacks the reporting channels* for the kind of fine-grained
error reporting you're insisting on doing here.
FWIW this code would also be a lot cleaner as a static inline function
rather than a many-line macro.
Rich