Re: [Libguestfs] [nbdkit PATCH] retry: Tweak error message

Wednesday, 1 April 2020

On 4/1/20 5:12 AM, Richard W.M. Jones wrote:
...
 On Tue, Mar 31, 2020 at 04:00:22PM -0500, Eric Blake wrote:
> The retry filter defaults to 5 retries, but when run with verbose
> details produces some confusing output:
>
> $ rm -f /tmp/inject; (sleep 5s; touch /tmp/inject)& ./nbdkit -f -v -U - \
>    null 1G --filter=retry --filter=noextents --filter=error error-rate=100% \
>    error-file=/tmp/inject --filter=delay rdelay=1 \
>    --run 'qemu-img convert $nbd out.img'
> ...
> nbdkit: null[1]: debug: noextents: pread count=2097152 offset=14680064
> nbdkit: null[1]: debug: error: pread count=2097152 offset=14680064
> nbdkit: null[1]: error: injecting EIO error into pread
> nbdkit: null[1]: debug: retry 6: original errno = 5
> nbdkit: null[1]: debug: could not recover after 5 retries
> 
...
>    again:
>     /* Log the original errno since it will be lost when we retry. */
> -  nbdkit_debug ("retry %d: original errno = %d", data->retry+1, *err);
> +  nbdkit_debug ("retry attempt %d: original errno = %d", data->retry,
*err);
>
>     if (data->retry >= retries) {
>       nbdkit_debug ("could not recover after %d retries", retries);

 I think adding 1 was correct, so that the retries count from 1. 
But the problem is the timing of when the messages are printed.  And 
remember, this code is only reached after a failed attempt, not on 
success.  Pre-patch, we have:

initial try succeeds - no message

or (here, with retries limited to 2):

initial try fails:
  retry 1: original error
delay
retry 1 fails:
  retry 2: original error
delay
retry 2 fails:
  retry 3: original error
  could not recover after 2 retries

...

 How about moving the message to make it conditional instead,
 like this?

 again:
    /* Log the original errno since it will be lost when we retry. */
    if (data->retry < retries)
      nbdkit_debug ("retry %d: original errno = %d", data->retry+1, *err);
    else {
       nbdkit_debug ("could not recover after %d retries", retries);
    ... 
with your suggestion, we'd have:

initial try fails:
  retry 1: original error
delay
retry 1 fails:
  retry 2: original error
delay
retry 2 fails:
  could not recover after 2 retries

where we lost the log of the errno of retry 2.  Maybe better would be:

initial try fails:
  attempt 0 failed with errno %d, starting delay for retry 1
delay
retry 1 fails:
  attempt 1 failed with errno %d, starting delay for retry 2
delay
retry 2 fails:
  attempt 2 failed with errno %d, could not recover

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

Re: [Libguestfs] [nbdkit PATCH] retry: Tweak error message