Sorry for the late reply.
I just noticed that my mail config was borked; I was happily sending out
emails, but none of them reached anyone :-/
Fixed now.
On Fri, Jun 17, 2022 at 12:59:04PM +0200, Wouter Verhelst wrote:
Hi,
On Tue, Jun 14, 2022 at 03:38:19PM +0100, Richard W.M. Jones wrote:
> This is a follow-up to this thread:
>
> https://listman.redhat.com/archives/libguestfs/2022-June/thread.html#29210
>
> about getting the kernel client (nbd.ko) to obey block size
> constraints sent by the NBD server:
>
> https://github.com/NetworkBlockDevice/nbd/blob/master/doc/proto.md#block-...
>
> I was sent this very interesting design document about the original
> intent behind the kernel's I/O limits:
>
> https://people.redhat.com/msnitzer/docs/io-limits.txt
>
> There are four or five kernel block layer settings we could usefully
> adjust, and there are three NBD block size constraints, and in my
> opinion there's not a very clear mapping between them. But I'll have
> a go at what I think we should do.
>
> - - -
>
> (1) Kernel physical_block_size & logical_block_size: The example given
> is of a hard disk with 4K physical sectors (AF) which can nevertheless
> emulate 512-byte sectors. In this case you'd set physical_block_size
> = 4K, logical_block_size = 512b.
>
> Data structures (partition tables, etc) should be aligned to
> physical_block_size to avoid unnecessary RMW cycles. But the
> fundamental unit of I/O is logical_block_size.
>
> Current behaviour of nbd.ko is that logical_block_size ==
> physical_block_size == the nbd-client "-b" option (default: 512 bytes,
> contradicting the documentation).
Whoops, indeed. Fixed in git.
> I think we should set logical_block_size == physical_block_size ==
> MAX (512, NBD minimum block size constraint).
>
> What should happen to the nbd-client -b option?
I believe it remains useful to have an override for exceptional
situations. I think I'll leave it (but we can add an appropriate
warning to the man page that using it may be a bad idea).
It might be useful to extend the syntax to specify more than one block
size, given that there are going to be multiple ones now.
> (2) Kernel minimum_io_size: The documentation says this is the
> "preferred minimum unit for random I/O".
>
> Current behaviour of nbd.ko is this is not set.
>
> I think the NBD's preferred block size should map to minimum_io_size.
>
>
> (3) Kernel optimal_io_size: The documentation says this is the
> "[preferred] streaming I/O [size]".
>
> Current behaviour of nbd.ko is this is not set.
>
> NBD doesn't really have the concept of streaming vs random I/O, so we
> could either ignore this or set it to the same value as
> minimum_io_size.
>
> I have a kernel patch allowing nbd-client to set both minimum_io_size
> and optimal_io_size from userspace.
>
>
> (4) Kernel blk_queue_max_hw_sectors: This is documented as: "set max
> sectors for a request ... Enables a low level driver to set a hard
> upper limit, max_hw_sectors, on the size of requests."
>
> Current behaviour of nbd.ko is that we set this to 65536 (sectors?
> blocks?), which for 512b sectors is 32M.
>
> I think we could set this to MIN (32M, NBD maximum block size constraint),
> converting the result to sectors.
>
> - - -
>
> What do people think?
Yes, this all looks reasonable to me. Thanks.
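
For reference, the mapping proposed in (1)-(4) above could be sketched
roughly as follows. This is only an illustration, not the actual nbd.ko
or nbd-client code: the struct and function names are made up, and the
assert encodes my assumption that the server's constraints are ordered
minimum <= preferred <= maximum. Setting optimal_io_size equal to
minimum_io_size is just one of the two options mentioned in (3).

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE        512u
/* Current nbd.ko cap from (4): 65536 sectors of 512 bytes = 32M. */
#define KERNEL_MAX_REQUEST (65536u * SECTOR_SIZE)

/* Hypothetical holder for the three NBD block size constraints. */
struct nbd_block_sizes {
    uint32_t minimum;
    uint32_t preferred;
    uint32_t maximum;
};

/* Hypothetical holder for the kernel limits discussed in (1)-(4). */
struct queue_limits_sketch {
    uint32_t logical_block_size;
    uint32_t physical_block_size;
    uint32_t minimum_io_size;
    uint32_t optimal_io_size;
    uint32_t max_hw_sectors;   /* in 512-byte sectors */
};

uint32_t max_u32(uint32_t a, uint32_t b) { return a > b ? a : b; }
uint32_t min_u32(uint32_t a, uint32_t b) { return a < b ? a : b; }

struct queue_limits_sketch map_limits(struct nbd_block_sizes s)
{
    struct queue_limits_sketch q;

    /* Assumption: the server sent sanely ordered constraints. */
    assert(s.minimum <= s.preferred && s.preferred <= s.maximum);

    /* (1) logical == physical == MAX (512, NBD minimum) */
    q.logical_block_size = max_u32(SECTOR_SIZE, s.minimum);
    q.physical_block_size = q.logical_block_size;

    /* (2) NBD preferred block size -> minimum_io_size */
    q.minimum_io_size = s.preferred;

    /* (3) one option: same value as minimum_io_size */
    q.optimal_io_size = q.minimum_io_size;

    /* (4) MIN (32M, NBD maximum), converted to sectors */
    q.max_hw_sectors = min_u32(KERNEL_MAX_REQUEST, s.maximum) / SECTOR_SIZE;

    return q;
}
```

With this sketch, a server advertising minimum 1, preferred 4096 and
maximum 64M would end up with logical_block_size 512, minimum_io_size
4096 and max_hw_sectors 65536.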
--
w(a)uter.{be,co.za}
wouter(a){grep.be,fosdem.org,debian.org}