On 06/29/22 15:36, Richard W.M. Jones wrote:
Because the test previously used error rates of 50%, it could
sometimes "fail to fail". This is noticable if you run the test
repeatedly:
$ while make -C copy check TESTS=copy-nbd-error.sh >& /tmp/log ; do echo -n . ;
done
This now happens more often because of the larger requests made by the
new multi-threaded loop: fewer requests mean fewer calls to the error
filter, and therefore a greater chance that a series of 50% coin tosses
all come up heads in the test.
Fix this by making the test non-stochastic.
Fixes: commit 8d444b41d09a700c7ee6f9182a649f3f2d325abb
---
copy/copy-nbd-error.sh | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/copy/copy-nbd-error.sh b/copy/copy-nbd-error.sh
index 0088807f54..01524a890c 100755
--- a/copy/copy-nbd-error.sh
+++ b/copy/copy-nbd-error.sh
@@ -40,7 +40,7 @@ $VG nbdcopy -- [ nbdkit --exit-with-parent -v --filter=error pattern 5M \
# Failure to read should be fatal
echo "Testing read failures on non-sparse source"
$VG nbdcopy -- [ nbdkit --exit-with-parent -v --filter=error pattern 5M \
- error-pread-rate=0.5 ] null: && fail=1
+ error-pread-rate=1 ] null: && fail=1
# However, reliable block status on a sparse image can avoid the need to read
echo "Testing read failures on sparse source"
@@ -51,7 +51,7 @@ $VG nbdcopy -- [ nbdkit --exit-with-parent -v --filter=error null 5M \
echo "Testing write data failures on arbitrary destination"
$VG nbdcopy -- [ nbdkit --exit-with-parent -v pattern 5M ] \
[ nbdkit --exit-with-parent -v --filter=error --filter=noextents \
- memory 5M error-pwrite-rate=0.5 ] && fail=1
+ memory 5M error-pwrite-rate=1 ] && fail=1
# However, writing zeroes can bypass the need for normal writes
echo "Testing write data failures from sparse source"
Wasn't the original intent of the 50% error rate that the first error
would usually manifest at a different offset on each run? If we change
the error rate to 1, the tests will fail on the very first access, which
kind of breaks the original intent.
I wonder if we could determine a random offset in advance, and make sure
that the read or write access fails 100% of the time, but only if the
request covers that offset.
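Very roughly, I am thinking of a predicate like the one below. This is
only a sketch of the idea; "bad_offset" and "covers_bad_offset" are
made-up names, and as far as I know the error filter has no such
trigger today, so it would need either a new filter parameter or a
small custom plugin:

# Pick the "bad" byte once, up front, somewhere inside the 5M image
# (5M is simply the size the existing test uses).
bad_offset=$(shuf -n 1 -i 0-$(( 5 * 1024 * 1024 - 1 )))

# A request [offset, offset+count) must fail if and only if it covers
# bad_offset; every other request succeeds deterministically.
covers_bad_offset ()
{
    local offset=$1 count=$2
    test "$offset" -le "$bad_offset" &&
    test "$bad_offset" -lt $(( offset + count ))
}

# For example, a 256K read starting at offset 0:
if covers_bad_offset 0 $(( 256 * 1024 )); then
    echo "request covers byte $bad_offset, fail it with EIO"
else
    echo "request misses byte $bad_offset, let it through"
fi

That way each run would still exercise a different offset, but the
failure itself would be deterministic once the offset is chosen.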
...
The probability that n consecutive accesses *don't* fail is
(1-error_rate)^n. (The probability that at least one access fails is
1-(1-error_rate)^n.)
And n is given by (I think?) image_size/request_size. So, if the
request_size changes, n changes with it, and we can recalculate the
error rate so that the test "fails to fail" with the same probability
as before:
(1-err1)^(imgsz/rsz1) = (1-err2)^(imgsz/rsz2)

take the imgsz'th root of both sides:

(1-err1)^(1/rsz1) = (1-err2)^(1/rsz2)

raise both sides to the rsz2'nd power:

(1-err1)^(rsz2/rsz1) = 1-err2

err2 = 1 - (1-err1)^(rsz2/rsz1)
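Once rsz1 and rsz2 are known, the formula is trivial to evaluate from
the shell; the 256K and 1M sizes below are made-up placeholders, purely
to show the shape of the computation, not the real request sizes:

$ awk -v err1=0.5 -v rsz1=$(( 256 * 1024 )) -v rsz2=$(( 1024 * 1024 )) \
      'BEGIN { printf "err2 = %g\n", 1 - (1 - err1) ^ (rsz2 / rsz1) }'
err2 = 0.9375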
I know that err1=0.5, but don't know rsz2 and rsz1 (the request sizes
after, and before, the last patch in the series). Assuming (just
guessing!) we increased the request size 8-fold, we'd have to go from
error rate 0.5 to:
err2 = 1 - (1-0.5)^8
     = 1 - (1/2)^8
     = 1 - (1/256)
     = 255/256
     = 0.99609375
We basically group every eight coin tosses into one super-toss, and want
the latter to show "failure" with the same probability as *at least one*
of the original 8 tosses failing.
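A quick sanity check of that grouping argument, simulating both
variants with awk's rand(); both estimates should come out near
255/256 = 0.9961 (the 8 and 0.5 are from the guess above, the 100000
sample count is arbitrary):

$ awk 'BEGIN {
    srand()
    n = 100000
    for (i = 0; i < n; i++) {
        # eight independent 50% tosses: did at least one of them fail?
        fail = 0
        for (j = 0; j < 8; j++)
            if (rand() < 0.5)
                fail = 1
        eight += fail
        # one "super-toss" at the recalculated 255/256 error rate
        single += (rand() < 255 / 256)
    }
    printf "P(at least one of 8 tosses fails) ~ %.4f\n", eight / n
    printf "P(one super-toss fails)           ~ %.4f\n", single / n
}'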
Laszlo