On Sat, Oct 23, 2021 at 01:25:21AM +0300, Nir Soffer wrote:
I see a rare random failure when calling BlockStatus via Go binding:
block_status: nbd_block_status: poll: Interrupted system call
I could not reproduce this with "nbdinfo --map", even after modifying it
to call nbd_block_status() for every 128 MiB.
Fixing this in nbd_unlock_poll() avoids this issue in the entire
library, when we wait for command completion. This seems more useful
that fixing it in all libnbd clients.
Tested using a go client listing all extents in an image, calling
BlockStatus for every 128m with fedora 34 qcow2 image. Without this fix,
this was always failing.
$ hyperfine -r1000 --show-output "./client nbd+unix://?socket=/tmp/nbd.sock >
/dev/null"
Benchmark 1: ./client nbd+unix://?socket=/tmp/nbd.sock > /dev/null
Time (mean ± σ): 31.6 ms ± 3.1 ms [User: 8.8 ms, System: 7.2 ms]
Range (min … max): 26.1 ms … 52.3 ms 1000 runs
Signed-off-by: Nir Soffer <nsoffer(a)redhat.com>
---
lib/poll.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/lib/poll.c b/lib/poll.c
index edfcc59..df01d94 100644
--- a/lib/poll.c
+++ b/lib/poll.c
@@ -57,8 +57,11 @@ nbd_unlocked_poll (struct nbd_handle *h, int timeout)
* would allow other threads to close file descriptors which we have
* passed to poll.
*/
- r = poll (fds, 1, timeout);
- debug (h, "poll end: r=%d revents=%x", r, fds[0].revents);
+ do {
+ r = poll (fds, 1, timeout);
+ debug (h, "poll end: r=%d revents=%x", r, fds[0].revents);
+ } while (r == -1 && errno == EINTR);
+
if (r == -1) {
set_error (errno, "poll");
return -1;
I think this is a perfectly reasonable solution, and better than
modifying the language bindings. I guess Eric should have the final
word here, but from my point of view ACK.
Rich.
--
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones
Read my programming and virtualization blog:
http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html