[adding qemu-block in cc]
On Sat, Jul 13, 2024 at 03:40:36PM GMT, Richard W.M. Jones wrote:
This is expanding on the commit message I wrote here:
https://gitlab.com/nbdkit/nbdkit/-/commit/780599d2e77c7cc4c1a7e99d0a93328...
A simple "one-liner" to test if NBD block size preferences are passed
correctly through qemu and into a Linux guest is this:
  $ nbdkit memory 1G --filter=blocksize-policy \
           blocksize-minimum=4096 \
           blocksize-preferred=65536 \
           blocksize-maximum=8M \
           --run '
      LIBGUESTFS_HV=/path/to/qemu-system-x86_64 \
      LIBGUESTFS_BACKEND=direct \
      guestfish --format=raw -a "$uri" \
          run : \
          debug sh "head -1 /sys/block/*/queue/*_io_size" : \
          debug sh "for d in /dev/sd? ; do sg_inq -p 0xb0 \$d ; done" \
  '
Current qemu (9.0.0) does not pass the block size preferences
correctly. It's a problem in qemu, not in Linux.
qemu's NBD client requests the block size preferences from nbdkit and
reads them correctly. I verified this by adding some print statements
to nbd/client.c. The sizes are stored in the 'info' field of BDRVNBDState.
qemu's virtio-scsi driver *can* present a block limits VPD page (0xb0)
containing these limits (see hw/scsi/scsi-disk.c), and Linux is able
to see the contents of this page using tools like 'sg_inq'. Linux
appears to translate the information faithfully into
/sys/block/sdX/queue/{minimum,optimal}_io_size files.
However, the virtio-scsi driver in qemu populates this information from
the qemu command line (-device [...]min_io_size=512,opt_io_size=4096).
It doesn't pass the information through from the NBD source backing
the drive.
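For illustration only, here is a minimal sketch of what setting these limits by hand on the qemu command line could look like. It builds the value for a -device option using the min_io_size and opt_io_size properties mentioned above. The helper name scsi_hd_device_arg and the drive id "hd0" are hypothetical, and the sizes simply mirror the nbdkit example; this is not what guestfish or qemu actually generates.

```shell
# Hypothetical helper: print a value for qemu's -device option that sets
# the block limits explicitly on a scsi-hd device.
#   $1 = id of the backing drive (assumed to be defined elsewhere via -drive)
#   $2 = minimum I/O size in bytes
#   $3 = optimal I/O size in bytes
scsi_hd_device_arg() {
    printf 'scsi-hd,drive=%s,min_io_size=%s,opt_io_size=%s\n' "$1" "$2" "$3"
}
```

Calling scsi_hd_device_arg hd0 4096 65536 would produce a string to pass after -device, so that the guest sees the same minimum and preferred sizes that nbdkit advertised in the example.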
Is guestfish the one synthesizing the '-device min_io_size=512' used
by qemu? I don't see it in the nbdkit command line posted above. Or
is guestfish leaving it up to qemu to advertise its defaults, and this
is merely a case of qemu favoring its defaults over what the device
advertised?
Fixing this seems like a non-trivial amount of work.
Indeed, if guestfish is passing command-line defaults for qemu to use,
we have to determine when to prioritize hardware advertisements over
command-line defaults, while still maintaining flexibility to
intentionally pick different sizes than what hardware advertised for
the purposes of performance testing.
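To make the precedence question concrete, here is a rough sketch of one possible policy. It is an assumption for discussion, not qemu's current behavior: an explicitly set value wins, otherwise the size advertised by the backend is used, otherwise a built-in default applies. The function name pick_io_size is hypothetical, and an empty string or 0 is taken to mean "not set".

```shell
# pick_io_size USER ADVERTISED DEFAULT
# Hypothetical precedence (not qemu's actual logic):
#   1. an explicit user-supplied value wins, so sizes that differ from
#      what the backend advertised can still be chosen for testing;
#   2. otherwise use the value the backend (e.g. an NBD server) advertised;
#   3. otherwise fall back to the built-in default.
pick_io_size() {
    user=$1
    advertised=$2
    default=$3
    if [ -n "$user" ] && [ "$user" != 0 ]; then
        echo "$user"
    elif [ -n "$advertised" ] && [ "$advertised" != 0 ]; then
        echo "$advertised"
    else
        echo "$default"
    fi
}
```

With this sketch, pick_io_size "" 65536 4096 prints 65536, while pick_io_size 512 65536 4096 prints 512. Note that it only works if "not set" can be told apart from "explicitly set to the default", which is exactly the open question if a management layer always passes values on the command line.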
--
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org