This is a follow-up to this thread:
https://listman.redhat.com/archives/libguestfs/2022-June/thread.html#29210
about getting the kernel client (nbd.ko) to obey block size
constraints sent by the NBD server:
https://github.com/NetworkBlockDevice/nbd/blob/master/doc/proto.md#block-...
I was sent this very interesting design document about the original
intent behind the kernel's I/O limits:
https://people.redhat.com/msnitzer/docs/io-limits.txt
There are four or five kernel block layer settings we could usefully
adjust, and there are three NBD block size constraints, and in my
opinion there's not a very clear mapping between them. But I'll have
a go at what I think we should do.
- - -
(1) Kernel physical_block_size & logical_block_size: The example given
is of a hard disk with 4K physical sectors (AF) which can nevertheless
emulate 512-byte sectors. In this case you'd set physical_block_size
= 4K, logical_block_size = 512 bytes.
Data structures (partition tables, etc) should be aligned to
physical_block_size to avoid unnecessary RMW cycles. But the
fundamental unit of I/O is logical_block_size.
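To illustrate the alignment point, here is a small sketch in Python (not
kernel code; the function name is made up for this example). It checks
whether a partition's start offset falls on a physical_block_size
boundary. The sector numbers are only illustrative.

```python
LOGICAL = 512    # logical_block_size from the AF example
PHYSICAL = 4096  # physical_block_size from the AF example


def is_aligned(offset_bytes, physical_block_size=PHYSICAL):
    """Return True if offset_bytes is a multiple of physical_block_size."""
    return offset_bytes % physical_block_size == 0


# A partition starting at logical sector 63 is not on a 4K boundary, so a
# 4K write at its start touches two physical sectors (RMW on both).
print(is_aligned(63 * LOGICAL))    # False
# A partition starting at logical sector 2048 (1M) is on a 4K boundary.
print(is_aligned(2048 * LOGICAL))  # True
```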
Current behaviour of nbd.ko is that logical_block_size ==
physical_block_size == the nbd-client "-b" option (default: 512 bytes,
contradicting the documentation).
I think we should set logical_block_size == physical_block_size ==
MAX (512, NBD minimum block size constraint).
What should happen to the nbd-client -b option?
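A minimal sketch of the rule proposed in (1), written in Python rather
than kernel C. The names are hypothetical, and it assumes the NBD minimum
block size has already been read from the server.

```python
KERNEL_MIN_BLOCK = 512  # floor used in the proposal above


def kernel_block_size(nbd_min_block):
    """Value proposed for both logical_block_size and physical_block_size."""
    return max(KERNEL_MIN_BLOCK, nbd_min_block)


# Hypothetical server advertising a 1-byte minimum: clamp up to 512.
print(kernel_block_size(1))     # 512
# Hypothetical server advertising a 4K minimum: use it as is.
print(kernel_block_size(4096))  # 4096
```

This leaves the question about the -b option open; the sketch only shows
what the server-derived value would be.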
(2) Kernel minimum_io_size: The documentation says this is the
"preferred minimum unit for random I/O".
Current behaviour of nbd.ko is that this is not set.
I think the NBD preferred block size should map to minimum_io_size.
(3) Kernel optimal_io_size: The documentation says this is the
"[preferred] streaming I/O [size]".
Current behaviour of nbd.ko is that this is not set.
NBD doesn't really have the concept of streaming vs random I/O, so we
could either ignore this or set it to the same value as
minimum_io_size.
I have a kernel patch allowing nbd-client to set both minimum_io_size
and optimal_io_size from userspace.
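The two options for (2) and (3) could be sketched like this in Python.
This is not the kernel patch mentioned above; the function name and the
set_optimal flag are made up, and 0 is used here as the sketch's
convention for "not set".

```python
def io_size_hints(nbd_preferred, set_optimal=True):
    """Map the NBD preferred block size to (minimum_io_size, optimal_io_size).

    With set_optimal=False, optimal_io_size is left at 0 ("not set").
    """
    minimum_io_size = nbd_preferred
    optimal_io_size = nbd_preferred if set_optimal else 0
    return minimum_io_size, optimal_io_size


# Option A: set optimal_io_size to the same value as minimum_io_size.
print(io_size_hints(4096))                     # (4096, 4096)
# Option B: ignore optimal_io_size.
print(io_size_hints(4096, set_optimal=False))  # (4096, 0)
```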
(4) Kernel blk_queue_max_hw_sectors: This is documented as: "set max
sectors for a request ... Enables a low level driver to set a hard
upper limit, max_hw_sectors, on the size of requests."
Current behaviour of nbd.ko is that we set this to 65536 (sectors?
blocks?), which for 512-byte sectors is 32M.
I think we could set this to MIN (32M, NBD maximum block size constraint),
converting the result to sectors.
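The arithmetic for (4), as a Python sketch. It assumes the limit is
counted in 512-byte sectors, as in the text above; the function name and
the example server values are hypothetical.

```python
SECTOR_SIZE = 512                    # assumed sector size, in bytes
CURRENT_LIMIT = 65536 * SECTOR_SIZE  # current nbd.ko limit in bytes (32M)


def max_hw_sectors(nbd_max_block):
    """MIN (32M, NBD maximum block size constraint), converted to sectors."""
    limit_bytes = min(CURRENT_LIMIT, nbd_max_block)
    return limit_bytes // SECTOR_SIZE


print(CURRENT_LIMIT == 32 * 1024 * 1024)  # True
# Server maximum above 32M: the current limit still applies.
print(max_hw_sectors(64 * 1024 * 1024))   # 65536
# Server maximum of 1M: the limit drops to 2048 sectors.
print(max_hw_sectors(1024 * 1024))        # 2048
```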
- - -
What do people think?
Rich.
--
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones
Read my programming and virtualization blog:
http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html