On 06/29/22 15:36, Richard W.M. Jones wrote:
Because the test previously used error rates of 50%, it could
sometimes "fail to fail". This is noticable if you run the test
repeatedly:
$ while make -C copy check TESTS=copy-nbd-error.sh >& /tmp/log ; do echo -n . ;
done
This now happens more often because of the larger requests made by the
new multi-threaded loop: fewer requests mean fewer calls to the error
filter, and therefore a greater chance that a series of 50% coin tosses
all come up heads in the test.
Fix this by making the test non-stochastic.
Fixes: commit 8d444b41d09a700c7ee6f9182a649f3f2d325abb
---
copy/copy-nbd-error.sh | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/copy/copy-nbd-error.sh b/copy/copy-nbd-error.sh
index 0088807f54..01524a890c 100755
--- a/copy/copy-nbd-error.sh
+++ b/copy/copy-nbd-error.sh
@@ -40,7 +40,7 @@ $VG nbdcopy -- [ nbdkit --exit-with-parent -v --filter=error pattern 5M \
# Failure to read should be fatal
echo "Testing read failures on non-sparse source"
$VG nbdcopy -- [ nbdkit --exit-with-parent -v --filter=error pattern 5M \
- error-pread-rate=0.5 ] null: && fail=1
+ error-pread-rate=1 ] null: && fail=1
# However, reliable block status on a sparse image can avoid the need to read
echo "Testing read failures on sparse source"
@@ -51,7 +51,7 @@ $VG nbdcopy -- [ nbdkit --exit-with-parent -v --filter=error null 5M \
echo "Testing write data failures on arbitrary destination"
$VG nbdcopy -- [ nbdkit --exit-with-parent -v pattern 5M ] \
[ nbdkit --exit-with-parent -v --filter=error --filter=noextents \
- memory 5M error-pwrite-rate=0.5 ] && fail=1
+ memory 5M error-pwrite-rate=1 ] && fail=1
# However, writing zeroes can bypass the need for normal writes
echo "Testing write data failures from sparse source"
Wasn't the original intent of the 50% error rate that the first error
would usually manifest at a different offset on each run? If we change
the error rate to 1, the tests will fail on the very first access, which
kind of breaks the original intent.
I wonder if we could determine a random offset in advance, and make sure
that the read or write access fails 100% of the time, but only if the
request covers that offset.
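Very roughly, I am thinking of a predicate like the one below. This is
only a sketch of the idea; "bad_offset" and "covers_bad_offset" are
made-up names, and as far as I know the error filter has no such
trigger today, so it would need either a new filter parameter or a
small custom plugin:

# Pick the "bad" byte once, up front, somewhere inside the 5M image
# (5M is simply the size the existing test uses).
bad_offset=$(shuf -n 1 -i 0-$(( 5 * 1024 * 1024 - 1 )))

# A request [offset, offset+count) must fail if and only if it covers
# bad_offset; every other request succeeds deterministically.
covers_bad_offset ()
{
    local offset=$1 count=$2
    test "$offset" -le "$bad_offset" &&
    test "$bad_offset" -lt $(( offset + count ))
}

# For example, a 256K read starting at offset 0:
if covers_bad_offset 0 $(( 256 * 1024 )); then
    echo "request covers byte $bad_offset, fail it with EIO"
else
    echo "request misses byte $bad_offset, let it through"
fi

That way each run would still exercise a different offset, but the
failure itself would be deterministic once the offset is chosen.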
...
The probability that n consecutive accesses *don't* fail is
(1-error_rate)^n. (The probability that at least one access fails is
1-(1-error_rate)^n.)
And n is given by (I think?) image_size/request_size. So, if the
request_size changes, n changes with it, and we can recalculate the
error rate so that the test "fails to fail" with the same probability
as before:
(1-err1)^(imgsz/rsz1) = (1-err2)^(imgsz/rsz2)

take the imgsz'th root of both sides:

(1-err1)^(1/rsz1) = (1-err2)^(1/rsz2)

raise both sides to the rsz2'nd power:

(1-err1)^(rsz2/rsz1) = 1-err2

err2 = 1 - (1-err1)^(rsz2/rsz1)
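Once rsz1 and rsz2 are known, the formula is trivial to evaluate from
the shell; the 256K and 1M sizes below are made-up placeholders, purely
to show the shape of the computation, not the real request sizes:

$ awk -v err1=0.5 -v rsz1=$(( 256 * 1024 )) -v rsz2=$(( 1024 * 1024 )) \
      'BEGIN { printf "err2 = %g\n", 1 - (1 - err1) ^ (rsz2 / rsz1) }'
err2 = 0.9375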
I know that err1=0.5, but don't know rsz2 and rsz1 (the request sizes
after, and before, the last patch in the series). Assuming (just
guessing!) we increased the request size 8-fold, we'd have to go from
error rate 0.5 to:
err2 = 1 - (1-0.5)^8
     = 1 - (1/2)^8
     = 1 - (1/256)
     = 255/256
     = 0.99609375
We basically group every eight coin tosses into one super-toss, and want
the latter to show "failure" with the same probability as *at least one*
of the original 8 tosses failing.
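A quick sanity check of that grouping argument, simulating both
variants with awk's rand(); both estimates should come out near
255/256 = 0.9961 (the 8 and 0.5 are from the guess above, the 100000
sample count is arbitrary):

$ awk 'BEGIN {
    srand()
    n = 100000
    for (i = 0; i < n; i++) {
        # eight independent 50% tosses: did at least one of them fail?
        fail = 0
        for (j = 0; j < 8; j++)
            if (rand() < 0.5)
                fail = 1
        eight += fail
        # one "super-toss" at the recalculated 255/256 error rate
        single += (rand() < 255 / 256)
    }
    printf "P(at least one of 8 tosses fails) ~ %.4f\n", eight / n
    printf "P(one super-toss fails)           ~ %.4f\n", single / n
}'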
Laszlo