On Fri, Sep 22, 2023 at 04:03:19PM -0500, Eric Blake wrote:
On Fri, Sep 22, 2023 at 09:47:55PM +0100, Richard W.M. Jones wrote:
> On Fri, Sep 22, 2023 at 12:33:36PM -0500, Eric Blake wrote:
> > There are a number of ways in which gathering information can fail.
> > But when possible, it makes sense to let the server know that we are
> > done talking, as it minimizes the likelihood that nbdinfo's error
> > message will be obscured by an accompanying error message by the
> > server complaining about an unclean disconnect.
> >
> > For example, with a one-off qemu temporarily patched as mentioned in
> > commit 0f8ee8c6 to advertise sizes larger than 2^63, kicking off
> > 'qemu-nbd -f raw -r file &' produces:
> >
> > pre-patch:
> >
> > $ ./run nbdinfo --size nbd://localhost
> > /home/eblake/libnbd/info/.libs/nbdinfo: nbd_get_size: server claims size
9223372036854781440 which does not fit in signed result: Value too large for defined data
type
> > qemu-nbd: option negotiation failed: Failed to read opts magic: Unexpected
end-of-file before all data were read
>
> This doesn't necessarily seem like a bug?
It's a quality of service thing - qemu is just being noisy that the
client closed the socket abruptly without giving any reason why. It
doesn't change libnbd's behavior (nbdinfo has already reported its
error message), but may confuse people reading qemu-nbd logs.
It may also be a case where we could patch qemu-nbd to not output a
complaint if the client exited at a clean point in the back-and-forth
transactions. We still want to be noisy if the socket closes after
the first byte has been read, but if there are zero bytes available,
announcing that the client no longer cares does not add much value.
>
> This feels like quite a significant change in behaviour to me. In
> particular I'm worried if it interacts in some subtle way with the
> forking done by the "[ ... ]" syntax for running servers on the
> command line (or any of the other ways that libnbd might fork/exec).
Interesting observation. atexit() handlers are not preserved across
exec, and I think all our fork() paths end in either exec or _exit
(also where atexit handlers are ignored), so I don't think we are
risking calling the handler twice.
>
> Can we hold this patch until after 1.18 is released and then put it
> in? Should only be a week or two.
Sure, being conservative on this one is fine by me. I'll delay it
until after 1.18.
>
> Provisionally ACKed for post-1.18
Now in as commit fd4f3fea, so we can start getting some soak time under CI.
--
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:
qemu.org |
libguestfs.org