On Mon, Jul 15, 2024 at 09:44:08AM -0500, Eric Blake wrote:
> [adding qemu-block in cc]
>
> On Sat, Jul 13, 2024 at 03:40:36PM GMT, Richard W.M. Jones wrote:
> > This is expanding on the commit message I wrote here:
> >
> > https://gitlab.com/nbdkit/nbdkit/-/commit/780599d2e77c7cc4c1a7e99d0a93328...
> >
> > A simple "one-liner" to test if NBD block size preferences are passed
> > correctly through qemu and into a Linux guest is this:
> >
> > $ nbdkit memory 1G --filter=blocksize-policy \
> >     blocksize-minimum=4096 \
> >     blocksize-preferred=65536 \
> >     blocksize-maximum=8M \
> >     --run '
> >       LIBGUESTFS_HV=/path/to/qemu-system-x86_64 \
> >       LIBGUESTFS_BACKEND=direct \
> >       guestfish --format=raw -a "$uri" \
> >         run : \
> >         debug sh "head -1 /sys/block/*/queue/*_io_size" : \
> >         debug sh "for d in /dev/sd? ; do sg_inq -p 0xb0 \$d ; done" \
> >     '
> >
> > Current qemu (9.0.0) does not pass the block size preferences
> > correctly. It's a problem in qemu, not in Linux.
> >
> > qemu's NBD client requests the block size preferences from nbdkit and
> > reads them correctly. I verified this by adding some print statements
> > into nbd/client.c. The sizes are stored in BDRVNBDState 'info' field.
> >
> > qemu's virtio-scsi driver *can* present a block limits VPD page (0xb0)
> > containing these limits (see hw/scsi/scsi-disk.c), and Linux is able
> > to see the contents of this page using tools like 'sg_inq'. Linux
> > appears to translate the information faithfully into
> > /sys/block/sdX/queue/{minimum,optimal}_io_size files.
> >
> > However the virtio-scsi driver in qemu populates this information from
> > the qemu command line (-device [...]min_io_size=512,opt_io_size=4096).
> > It doesn't pass the information through from the NBD source backing
> > the drive.
>
> Is guestfish the one synthesizing the '-device min_io_size=512' used
> by qemu?

We don't add those parameters because we want it passed through from
the backing NBD device. We do generate the -drive/-device parameter
(or do it via libvirt). Here's an example to make this clearer:

  $ nbdkit memory 1G --run '
      LIBGUESTFS_BACKEND=direct guestfish -vx --format=raw -a "$uri" run
    '
  ...
  /usr/bin/qemu-kvm \
  ...
      -drive file=nbd:unix:/tmp/nbdkitiJ5lz0/socket,cache=writeback,format=raw,id=hd0,if=none \
      -device scsi-hd,drive=hd0 \
  ...

> I don't see it in the nbdkit command line posted above. Or
> is guestfish leaving it up to qemu to advertise its defaults, and this
> is merely a case of qemu favoring its defaults over what the device
> advertised?

qemu's NBD client fetches the I/O size preferences from nbdkit and
stores them in BDRVNBDState->info, but they get no further than that
inside qemu. I expected they'd be passed through to the virtio-scsi
device and from there into the guest.
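
As a rough way to check this from inside the guest, something like the
following could be used. It is only a sketch: the helper name
check_io_sizes and its arguments are made up for illustration, and the
optional last argument (an alternative sysfs root) exists only so the
function can be tried without a real disk.

```shell
# Hypothetical helper: compare the I/O size hints that the guest kernel
# reports for one disk against the values the NBD server was configured
# with (for example blocksize-minimum and blocksize-preferred).
# usage: check_io_sizes DEV EXPECTED_MIN EXPECTED_OPT [SYSFS_ROOT]
check_io_sizes () {
    dev=$1 want_min=$2 want_opt=$3 sysfs=${4:-/sys}
    q="$sysfs/block/$dev/queue"

    # These are the same files the one-liner above prints with head -1.
    got_min=$(cat "$q/minimum_io_size") || return 2
    got_opt=$(cat "$q/optimal_io_size") || return 2

    if [ "$got_min" = "$want_min" ] && [ "$got_opt" = "$want_opt" ]; then
        echo "$dev: OK (min=$got_min opt=$got_opt)"
    else
        echo "$dev: MISMATCH (min=$got_min opt=$got_opt, wanted min=$want_min opt=$want_opt)"
        return 1
    fi
}
```

With the nbdkit settings from the one-liner this would be called as
"check_io_sizes sda 4096 65536" (sda is an assumption about how the
disk is named in the guest). Given the behaviour described above I'd
expect it to report a mismatch with current qemu.
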

> > Fixing this seems like a non-trivial amount of work.
>
> Indeed, if guestfish is passing command-line defaults for qemu to use,
> we have to determine when to prioritize hardware advertisements over
> command-line defaults, while still maintaining flexibility to
> intentionally pick different sizes than what hardware advertised for
> the purposes of performance testing.

We're not setting anything on the command line.
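
For the general case, the precedence described in the quoted paragraph
could be sketched roughly as below. This is a hypothetical policy
written as a shell function purely for illustration, not what qemu does
today; the name pick_io_size is invented, and it assumes qemu can tell
an explicitly set -device property apart from a built-in default (0 or
empty meaning "not set").

```shell
# Hypothetical policy sketch, not qemu's current behaviour:
#   1. a value given explicitly on the command line wins, so sizes can
#      still be overridden on purpose for performance testing;
#   2. otherwise use what the backend (e.g. the NBD server) advertised;
#   3. otherwise fall back to the built-in default.
# usage: pick_io_size EXPLICIT ADVERTISED DEFAULT
pick_io_size () {
    explicit=${1:-0} advertised=${2:-0} default=$3

    if [ "$explicit" -gt 0 ]; then
        echo "$explicit"
    elif [ "$advertised" -gt 0 ]; then
        echo "$advertised"
    else
        echo "$default"
    fi
}
```

In the libguestfs case nothing is set explicitly, so under this policy
"pick_io_size 0 65536 512" would choose the advertised value.
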
Rich.
--
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones
Read my programming and virtualization blog:
http://rwmj.wordpress.com
libguestfs lets you edit virtual machines. Supports shell scripting,
bindings from many languages.
http://libguestfs.org