Re: [Libguestfs] [PATCH nbdkit 1/5] Add new public nbdkit_parse_probability function

Wednesday, 17 May 2023

On 5/16/23 22:22, Richard W.M. Jones wrote:
...
 On Tue, May 16, 2023 at 02:35:38PM -0500, Eric Blake wrote: 
...
> Our pre-existing error filter is already doing percentage
calculations
> via floating point, so on the grounds that this patch is consolidating
> common code patterns to avoid reimplementation, we're good. But as you
> say, representing things as a ratio between two integers may be more
> maintainable than floating point where accuracy matters, or this may
> be a case where we are explicit that use of the floating point
> fractions as a percentage is intentionally liable to rounding issues.
> Still, avoiding undefined behavior when the division can overflow
> (whether that be the overflow to floating point infinity or even the
> overflow beyond UINT64_MAX), does make it worth questioning if
> floating point is our best representation, or at a minimum if we want
> to be stricter in what we parse before passing a substring on to
> strtod() so that we have more control over the values (part of my 19
> patches to qemu was explicitly truncating any string at 'e' or 'E'
so
> that strtod() could not do an exponent parse that would inadverently
> hit overflow to INFINITY).

 I'm not really sure we reached a conclusion so far, but I did want to
 say that I changed the evil filter impl so that it now treats any
 0 <= P < 1e-12 as effectively 0.  (P == evil-probability; P < 0 and P > 1
 are still rejected at config time as before).

 With P == 1e-12:

 nbdkit: debug: evil: mode: stuck-bits, P: 1e-12
 nbdkit: debug: evil: block_size: 17592186044416 (2**44)
 nbdkit: debug: evil: expected bits per block: 140.737

 (The large block size isn't an issue here as nothing is allocated.
 However it would be a problem if the block_size calculation overflows,
 and I believe that limiting P to larger values avoids this.) 
Right, this is about the direction I could agree with -- and again I
want to emphasize that you don't *need* my agreement to proceed with the
floating point approach!

I think enforcing a minimum for P could be useful too, for wherever we
divide by P.

...
 Also I'm not sure that handling probabilities as a numerator and
 denominator actually helps here?  Both still need to be stored as
 floats, so it just pushes the same problem to the plugin. 
No, the point with N/D is that they're both positive integers (well, N
may be zero as well), and that they *together* give you a single
rational probability, namely N/D. They'd be stored as separate unsigned
integer fields everywhere, and the division would only be performed (in
unsigned integer) wherever directly needed, such as in (a * N) / D.

I don't know if this is compatible with existent code. If we already
have code that parses floats (and Eric's comments imply we do), then
that code couldn't be rebased on this N/D representation, compatibly.
But, I'd consider introducing a new (in a way, "divergent") N/D
representation a fair price, for not adding more floating point!
Refactoring (extracting common code) is only useful if we can fix the
extracted code later on, and that's a tough challenge with floating point.

Laszlo

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

Re: [Libguestfs] [PATCH nbdkit 1/5] Add new public nbdkit_parse_probability function