On Sat, May 23, 2020 at 09:28:26AM -0700, Paul Eggert wrote:
On 5/23/20 9:11 AM, Rich Felker wrote:
> stopping on an initial prefix ... does not admit easily sharing a backend with
> strto*.
I don't see why. If the backend has a "stop scanning on integer overflow" flag
(which it would need to have anyway, to support the proposed behavior), then
*scanf can use the flag and strto* can not use it.
Anyway, this is not an issue for glibc, which has no such backend.
It's relevant because you want to propose this for standardization.
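
For concreteness, the flag-based backend described in the quoted text might
look roughly like the sketch below. It is only an illustration under stated
assumptions: scan_udec, its parameters, and the choice to clamp to ULONG_MAX
are hypothetical and are not taken from glibc, musl, or any proposal text.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical shared backend (illustration only, not real libc code).
 * Scans decimal digits at s and stores the accumulated value in *out.
 * Returns the number of characters consumed and sets *overflowed to 1
 * if the digit string did not fit in unsigned long.
 *
 * stop_on_overflow != 0: stop before the first digit that would
 *   overflow, leaving the value of the prefix in *out (the proposed
 *   *scanf behavior).
 * stop_on_overflow == 0: keep consuming digits and clamp the result
 *   to ULONG_MAX (strto*-style behavior). */
static size_t scan_udec(const char *s, int stop_on_overflow,
                        unsigned long *out, int *overflowed)
{
    assert(s && out && overflowed);
    unsigned long v = 0;
    size_t i = 0;
    *overflowed = 0;
    for (; s[i] >= '0' && s[i] <= '9'; i++) {
        unsigned d = (unsigned)(s[i] - '0');
        /* v * 10 + d fits iff v <= (ULONG_MAX - d) / 10 */
        if (v > (ULONG_MAX - d) / 10) {
            *overflowed = 1;
            if (stop_on_overflow)
                break;          /* digit i is left unconsumed */
            v = ULONG_MAX;      /* clamp, but keep consuming digits */
            continue;
        }
        v = v * 10 + d;
    }
    *out = v;
    return i;
}
```

Note that with the flag set, the number of characters consumed depends on
the width of unsigned long, so the extent of the match is determined by the
value and not only by the shape of the input.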
> that's contrary to the abstract behavior defined for scanf
> (matching fields syntactically then value conversion)
That's not really a problem. The abstract behavior already provides for matching
that is not purely syntactic. For example, string conversion specifiers can
impose length limits on the match, which means the matching does not rely purely
on the syntax of the input. It would be easy to say that integer conversion
specifiers can also impose limits related to integer overflow.
Sure, that's syntax. It's /[^ ]{1,n}/.
Of course for integers you can define a syntax that matches every
non-overflowing value (this is always true for finite matching sets),
but that's nothing like how the function is specified and I don't
think anyone reasonable would classify non-overflow as a syntactic
property.
> It's also even *more likely* to break programs that don't expect the
> behavior than just storing a wrapped or clamped value
That's not true of the code that I looked at (see the URLs earlier in this
thread). That code was pretty carefully written and yet still vulnerable to the
integer-overflow issue.
I don't follow. *Any* use of scanf on untrusted input is "vulnerable
to the integer-overflow issue" in the sense that overflow is UB. This
is not something subtle.
If you mean actually using overflowed values in an unsafe way
(assuming no ballooning effects of UB, just wrong values), I don't see
how it's subtle either. Any value that could be produced via overflow
could also be produced via non-overflowing input, and you have to
validate data either way.
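
As a rough sketch of what "validate either way" can look like without
*scanf: strtol reports out-of-range input via ERANGE instead of invoking
UB, and the caller still has to range-check the result. The helper name
parse_int_in_range and its return convention are made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper (illustration only).
 * Parses a decimal integer from s and accepts it only if the whole
 * string was consumed and the value lies in [lo, hi].
 * Returns 0 and stores the value in *out on success, -1 otherwise. */
static int parse_int_in_range(const char *s, int lo, int hi, int *out)
{
    assert(s && out);
    char *end;
    errno = 0;
    long v = strtol(s, &end, 10);
    if (end == s || *end != '\0')
        return -1;              /* no digits, or trailing junk */
    if (errno == ERANGE)
        return -1;              /* did not fit in long */
    if (v < lo || v > hi)
        return -1;              /* fits in long but is not acceptable */
    *out = (int)v;
    return 0;
}
```

The last check is needed regardless of how overflow is handled: a value
such as 101 is just as invalid for a 0..100 field as one that overflowed.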
> I'm pretty sure the real answer here is just "don't use *scanf for
> that."
Absolutely true right now. We are merely talking about (a) what sort of
implementation behavior is more useful for programs that are currently relying
on undefined behavior, and (b) what might be the cleanest addition to POSIX
later, to help improve this mess so that future programmers can use *scanf
safely in more situations.
This is absolutely not "clean" and I am opposed to it.
Rich