[PATCH libnbd 0/2] copy: Use preferred block size when writing
by Richard W.M. Jones
This patch series attempts to fix the well-known problem that nbdcopy
ignores the preferred block size. It only addresses the write side
(which is the case we care about for virt-v2v).
Unfortunately it is subtly incorrect, in a way that I don't currently
see how to fix. Hence I'm posting it now so I can return to it later.
[None of the problem description below will make any sense until you
look at the patches.]
The problem with the second patch is that the assembled blocks are
written synchronously on handle[0]. However the handle can be in use
concurrently by multi-thread-copying.c, resulting in two poll(2)
loops running at the same time and competing to consume the same events.
It's very hard to reproduce this problem -- all the tests run fine --
but the following command sometimes demonstrates it:
$ nbdkit -U - --filter=checkwrite --filter=offset data '1 @0x100000000 1' offset=1 --run './run nbdcopy "$uri" "$uri"'
It will either run normally, deadlock, or give a weird error from the
state machine which is characteristic of the two poll loops competing
for events:
> nbd+unix://?socket=/tmp/nbdkitVOF5BR/socket: nbd_shutdown: nothing to poll for in state REPLY.START: Invalid argument
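To make the conflict concrete, here is a purely illustrative sketch
(not the actual nbdcopy code; the function and variable names are
invented): the block assembler's synchronous nbd_pwrite runs its own
poll loop inside libnbd, while a copying thread is already polling the
same handle for its in-flight async commands, so both threads end up
in poll(2) competing for the same events.

/* Illustrative sketch only, not nbdcopy source. */
#include <stddef.h>
#include <stdint.h>
#include <libnbd.h>

static struct nbd_handle *dst;              /* shared handles[0] */

/* Copying thread: issues async writes, then polls for completions. */
static void
copy_thread_poll (void)
{
  while (nbd_aio_in_flight (dst) > 0)
    nbd_poll (dst, -1);                     /* poll loop #1 */
}

/* Block assembler: writes an assembled block synchronously.
 * nbd_pwrite internally runs its own poll loop on the same handle
 * (poll loop #2), competing with the loop above for events.
 */
static void
write_assembled_block (const void *buf, size_t len, uint64_t offset)
{
  nbd_pwrite (dst, buf, len, offset, 0);    /* error handling elided */
}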
One way to fix this would be to open an extra handle to the
destination NBD server for sending these completed blocks. However
that breaks various assumptions, and wouldn't work for !multi-conn
servers.
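A rough sketch of that first idea, purely for illustration (the helper
name and the nbd_can_multi_conn check are my assumptions, not anything
in the patches): open one extra connection used only for the
assembled-block writes, so the synchronous write never shares a poll
loop with the worker handles. As noted, this gives up on servers that
do not advertise multi-conn.

/* Hypothetical sketch of the "extra handle" approach. */
#include <stddef.h>
#include <libnbd.h>

static struct nbd_handle *
open_block_write_handle (const char *dst_uri)
{
  struct nbd_handle *h = nbd_create ();

  if (h == NULL)
    return NULL;
  if (nbd_connect_uri (h, dst_uri) == -1 ||
      nbd_can_multi_conn (h) != 1) {        /* !multi-conn server: give up */
    nbd_close (h);
    return NULL;
  }
  return h;     /* use this handle only for writing assembled blocks */
}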
Another way would be some kind of lock around handle[0], but that
seems hard to do given that we're doing asynchronous operations.
Rich.
[PATCH v3] spec: Clarify BLOCK_STATUS reply details
by Eric Blake
Our docs were inconsistent on whether a NBD_REPLY_TYPE_BLOCK_STATUS
reply chunk can exceed the client's requested length, and silent on
whether the lengths must be consistent when multiple contexts were
negotiated. Clarify this to match existing practice as implemented in
qemu-nbd. Clean up some nearby grammatical errors while at it.
---
Another round of rewording attempts, based on feedback from Rich on
v2. I went ahead and pushed patches 1 and 2 of the v2 series, as they
were less controversial.
doc/proto.md | 42 ++++++++++++++++++++++++++++--------------
1 file changed, 28 insertions(+), 14 deletions(-)
diff --git a/doc/proto.md b/doc/proto.md
index 8a817d2..bacccfa 100644
--- a/doc/proto.md
+++ b/doc/proto.md
@@ -882,15 +882,25 @@ The procedure works as follows:
server supports.
- During transmission, a client can then indicate interest in metadata
for a given region by way of the `NBD_CMD_BLOCK_STATUS` command,
- where *offset* and *length* indicate the area of interest. The
- server MUST then respond with the requested information, for all
- contexts which were selected during negotiation. For every metadata
- context, the server sends one set of extent chunks, where the sizes
- of the extents MUST be less than or equal to the length as specified
- in the request. Each extent comes with a *flags* field, the
- semantics of which are defined by the metadata context.
-- A server MUST reply to `NBD_CMD_BLOCK_STATUS` with a structured
- reply of type `NBD_REPLY_TYPE_BLOCK_STATUS`.
+ where *offset* and *length* indicate the area of interest. On
+ success, the server MUST respond with one structured reply chunk of
+ type `NBD_REPLY_TYPE_BLOCK_STATUS` per metadata context selected
+ during negotiation, where each reply chunk is a list of one or more
+ consecutive extents for that context. Each extent comes with a
+ *flags* field, the semantics of which are defined by the metadata
+ context.
+
+The client's requested *length* is only a hint to the server, so the
+cumulative extent length contained in a chunk of the server's reply
+may be shorter or longer than the original request. When more than one
+metadata context was negotiated, the reply chunks for the different
+contexts of a single block status request need not have the same
+number of extents or cumulative extent length.
+
+In the request, the client may use the `NBD_CMD_FLAG_REQ_ONE` command
+flag to further constrain the server's reply so that each chunk
+contains exactly one extent whose length does not exceed the client's
+original *length*.
A client MUST NOT use `NBD_CMD_BLOCK_STATUS` unless it selected a
nonzero number of metadata contexts during negotiation, and used the
@@ -1778,8 +1788,8 @@ MUST initiate a hard disconnect.
*length* MUST be 4 + (a positive integer multiple of 8). This reply
represents a series of consecutive block descriptors where the sum
of the length fields within the descriptors is subject to further
- constraints documented below. This chunk type MUST appear
- exactly once per metadata ID in a structured reply.
+ constraints documented below. A successful block status request MUST
+ have exactly one status chunk per negotiated metadata context ID.
The payload starts with:
@@ -1801,15 +1811,19 @@ MUST initiate a hard disconnect.
*length* of the final extent MAY result in a sum larger than the
original requested length, if the server has that information anyway
as a side effect of reporting the status of the requested region.
+ When multiple metadata contexts are negotiated, the reply chunks for
+ the different contexts need not have the same number of extents or
+ cumulative extent length.
Even if the client did not use the `NBD_CMD_FLAG_REQ_ONE` flag in
its request, the server MAY return fewer descriptors in the reply
than would be required to fully specify the whole range of requested
information to the client, if looking up the information would be
too resource-intensive for the server, so long as at least one
- extent is returned. Servers should however be aware that most
- clients implementations will then simply ask for the next extent
- instead.
+ extent is returned. Servers should however be aware that most
+ client implementations will likely follow up with a request for
+ extent information at the first offset not covered by a
+ reduced-length reply.
All error chunk types have bit 15 set, and begin with the same
*error*, *message length*, and optional *message* fields as
--
2.35.1
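For readers who want to see the semantics above from the client side,
here is a minimal libnbd sketch (my example, not part of the patch; it
assumes the server supports the "base:allocation" context and that a
URI is passed on the command line). On success the extent callback
fires once per negotiated context, and without LIBNBD_CMD_FLAG_REQ_ONE
the cumulative extent length it reports may differ from the requested
length.

/* Minimal sketch: query block status and print extents per context. */
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <inttypes.h>
#include <libnbd.h>

static int
extent_cb (void *user_data, const char *metacontext, uint64_t offset,
           uint32_t *entries, size_t nr_entries, int *error)
{
  size_t i;

  /* entries[] is a flat list of (length, flags) pairs. */
  printf ("context %s, reply starting at offset %" PRIu64 ":\n",
          metacontext, offset);
  for (i = 0; i + 1 < nr_entries; i += 2)
    printf ("  length %" PRIu32 " flags 0x%" PRIx32 "\n",
            entries[i], entries[i+1]);
  return 0;
}

int
main (int argc, char *argv[])
{
  struct nbd_handle *nbd;

  if (argc != 2) {
    fprintf (stderr, "usage: %s URI\n", argv[0]);
    exit (EXIT_FAILURE);
  }

  nbd = nbd_create ();
  if (nbd == NULL)
    exit (EXIT_FAILURE);
  nbd_add_meta_context (nbd, LIBNBD_CONTEXT_BASE_ALLOCATION);
  if (nbd_connect_uri (nbd, argv[1]) == -1) {
    fprintf (stderr, "%s\n", nbd_get_error ());
    exit (EXIT_FAILURE);
  }

  /* The requested length (1M here) is only a hint unless
   * LIBNBD_CMD_FLAG_REQ_ONE is passed as the final argument. */
  if (nbd_block_status (nbd, 1024*1024, 0,
                        (nbd_extent_callback) { .callback = extent_cb },
                        0) == -1)
    fprintf (stderr, "%s\n", nbd_get_error ());

  nbd_shutdown (nbd, 0);
  nbd_close (nbd);
  return 0;
}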
Re: [Libguestfs] nbdkit error: "write reply: NBD_CMD_WRITE: Broken pipe"
by Wouter Verhelst
Same story.
On Fri, Jun 17, 2022 at 01:09:25PM +0200, Wouter Verhelst wrote:
> Hi, and sorry for the delay (I was overseas for a month in May to visit family
> etc)
>
> On Tue, May 03, 2022 at 09:07:17AM +0100, Richard W.M. Jones wrote:
> > On Mon, May 02, 2022 at 03:36:33PM +0100, Nikolaus Rath wrote:
> > > So I tried to reproduce this, and noticed something odd. It seems I can
> > > disconnect the nbd device (nbd-client -d) while there are still requests
> > > in flight:
> > >
> > > May 02 15:20:50 vostro.rath.org kernel: nbd1: detected capacity change from 0 to 52428800
> > > May 02 15:20:50 vostro.rath.org kernel: block nbd1: NBD_DISCONNECT
> > > May 02 15:20:50 vostro.rath.org kernel: block nbd1: Disconnected due to user request.
> > > May 02 15:20:50 vostro.rath.org kernel: block nbd1: shutting down sockets
> > > May 02 15:20:50 vostro.rath.org kernel: I/O error, dev nbd1, sector 776 op 0x0:(READ) flags 0x80700 phys_seg 29 prio class 0
> > > May 02 15:20:50 vostro.rath.org kernel: I/O error, dev nbd1, sector 776 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> > > May 02 15:20:50 vostro.rath.org kernel: Buffer I/O error on dev nbd1, logical block 97, async page read
> > > May 02 15:20:50 vostro.rath.org kernel: block nbd1: Attempted send on invalid socket
> > > May 02 15:20:50 vostro.rath.org kernel: I/O error, dev nbd1, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
> > > May 02 15:20:50 vostro.rath.org kernel: block nbd1: Attempted send on invalid socket
> > > May 02 15:20:50 vostro.rath.org kernel: I/O error, dev nbd1, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
> > >
> > > This was generated by running:
> > >
> > > $ nbd-client localhost /dev/nbd1 && mkfs.ext4 /dev/nbd1 && nbd-client -d /dev/nbd1
> > >
> > > Is that expected behavior?
> >
> > It's a bit unexpected to me. Adding Wouter to the thread - he might
> > have an idea here, especially if there's a way to have "nbd-client -d"
> > wait for pending requests to finish before disconnecting.
> >
> > I don't use the kernel client very much myself. We mostly use either
> > libnbd or the qemu client.
>
> The kernel is supposed to deal with this. If there are still processes
> using the device, nbd-client -d should receive an error (I believe it
> was EPERM). If there are outstanding writes, the disconnect should flush
> those, wait for them to return, and *then* handle the disconnect.
>
> At least those are the assumptions made in nbd-client ;-)
>
> (I notice you did send this to the kernel maintainer too, but just
> wanted to follow up on what nbd-client assumes)
>
> --
> w(a)uter.{be,co.za}
> wouter(a){grep.be,fosdem.org,debian.org}
--
w(a)uter.{be,co.za}
wouter(a){grep.be,fosdem.org,debian.org}
Re: [Libguestfs] Kernel driver I/O block size hinting
by Wouter Verhelst
Sorry for the late reply.
I just noticed that my mail config was borked; I was happily sending out
emails, but none of them reached anyone :-/
Fixed now.
On Fri, Jun 17, 2022 at 12:59:04PM +0200, Wouter Verhelst wrote:
> Hi,
>
> On Tue, Jun 14, 2022 at 03:38:19PM +0100, Richard W.M. Jones wrote:
> > This is a follow-up to this thread:
> >
> > https://listman.redhat.com/archives/libguestfs/2022-June/thread.html#29210
> >
> > about getting the kernel client (nbd.ko) to obey block size
> > constraints sent by the NBD server:
> >
> > https://github.com/NetworkBlockDevice/nbd/blob/master/doc/proto.md#block-...
> >
> > I was sent this very interesting design document about the original
> > intent behind the kernel's I/O limits:
> >
> > https://people.redhat.com/msnitzer/docs/io-limits.txt
> >
> > There are four or five kernel block layer settings we could usefully
> > adjust, and there are three NBD block size constraints, and in my
> > opinion there's not a very clear mapping between them. But I'll have
> > a go at what I think we should do.
> >
> > - - -
> >
> > (1) Kernel physical_block_size & logical_block_size: The example given
> > is of a hard disk with 4K physical sectors (AF) which can nevertheless
> > emulate 512-byte sectors. In this case you'd set physical_block_size
> > = 4K, logical_block_size = 512b.
> >
> > Data structures (partition tables, etc) should be aligned to
> > physical_block_size to avoid unnecessary RMW cycles. But the
> > fundamental unit of I/O is logical_block_size.
> >
> > Current behaviour of nbd.ko is that logical_block_size ==
> > physical_block_size == the nbd-client "-b" option (default: 512 bytes,
> > contradicting the documentation).
>
> Whoops, indeed. Fixed in git.
>
> > I think we should set logical_block_size == physical_block_size ==
> > MAX (512, NBD minimum block size constraint).
> >
> > What should happen to the nbd-client -b option?
>
> I believe it remains useful to have an override for exceptional
> situations. I think I'll leave it (but we can provide an appropriate
> warning about this possibly being a bad idea in the man page)
>
> It might be useful to extend the syntax to specify more than one block
> size, given that there are going to be multiple ones now.
>
> > (2) Kernel minimum_io_size: The documentation says this is the
> > "preferred minimum unit for random I/O".
> >
> > Current behaviour of nbd.ko is this is not set.
> >
> > I think the NBD's preferred block size should map to minimum_io_size.
> >
> >
> > (3) Kernel optimal_io_size: The documentation says this is the
> > "[preferred] streaming I/O [size]".
> >
> > Current behaviour of nbd.ko is this is not set.
> >
> > NBD doesn't really have the concept of streaming vs random I/O, so we
> > could either ignore this or set it to the same value as
> > minimum_io_size.
> >
> > I have a kernel patch allowing nbd-client to set both minimum_io_size
> > and optimal_io_size from userspace.
> >
> >
> > (4) Kernel blk_queue_max_hw_sectors: This is documented as: "set max
> > sectors for a request ... Enables a low level driver to set a hard
> > upper limit, max_hw_sectors, on the size of requests."
> >
> > Current behaviour of nbd.ko is that we set this to 65536 (sectors?
> > blocks?), which for 512b sectors is 32M.
> >
> > I think we could set this to MIN (32M, NBD maximum block size constraint),
> > converting the result to sectors.
> >
> > - - -
> >
> > What do people think?
>
> Yes, this all looks reasonable to me. Thanks.
>
> --
> w(a)uter.{be,co.za}
> wouter(a){grep.be,fosdem.org,debian.org}
--
w(a)uter.{be,co.za}
wouter(a){grep.be,fosdem.org,debian.org}
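Putting the quoted points (1)-(4) together, here is a hypothetical
sketch of how the mapping could look inside the driver (not actual
nbd.ko code; the helper name, parameter names, and the assumption that
the three constraints arrive from the server's NBD_INFO_BLOCK_SIZE
reply are mine):

/* Hypothetical sketch, not nbd.ko source. */
#include <linux/blkdev.h>
#include <linux/kernel.h>

static void
nbd_apply_block_size_hints (struct request_queue *q,
                            u32 min_bs, u32 pref_bs, u32 max_bs)
{
        u32 lbs = max_t(u32, 512, min_bs);

        /* (1) logical == physical == MAX(512, minimum constraint) */
        blk_queue_logical_block_size(q, lbs);
        blk_queue_physical_block_size(q, lbs);

        /* (2) preferred block size -> minimum_io_size */
        blk_queue_io_min(q, pref_bs);

        /* (3) NBD has no separate streaming hint; reuse the preferred size */
        blk_queue_io_opt(q, pref_bs);

        /* (4) MIN(32M, maximum constraint), converted to 512-byte sectors */
        blk_queue_max_hw_sectors(q, min_t(u32, 32 * 1024 * 1024, max_bs) >> 9);
}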