Re: [Libguestfs] nbdcpy: from scratch nbdcopy using io_uring

Monday, 23 August 2021

I had an idea for optimizing my current approach. It works well in some
ways, but it could be faster with some breaking changes to the protocol.

Currently, we read (from the socket connected to the source) one request
at a time. The simple flow looks like `read_header(io_uring) ---- success --->
recv(data) --- success ---> send(data) & queue another read header`,
but it is not as efficient as it could be; at best it's a hack.

Another approach I am thinking about is a large buffer into which we
read all of the socket's data, and then process packets from that
buffer once the I/O completes.
This minimizes the number of read requests to the kernel, as we do 1
read for multiple NBD packets.
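
To make the large-buffer idea concrete, here is a rough Python sketch (an illustration only, not nbdcpy's code). It assumes the 20-byte response header is laid out like the NBD structured reply chunk header, with the payload length in its last 4 bytes; `split_packets` and its arguments are hypothetical names.

```python
import struct

# Assumed layout (NBD structured reply chunk header):
# magic u32, flags u16, type u16, handle u64, payload length u32 = 20 bytes.
REPLY_HEADER = struct.Struct(">IHHQI")
STRUCTURED_REPLY_MAGIC = 0x668E33EF


def split_packets(buf, filled):
    """Return ([(handle, payload_view), ...], consumed) for complete packets.

    buf is the large receive buffer and filled is how many bytes a single
    read placed into it. A trailing partial packet is left untouched so it
    can be completed by the next read.
    """
    view = memoryview(buf)
    packets = []
    pos = 0
    while filled - pos >= REPLY_HEADER.size:
        magic, _flags, _type, handle, length = REPLY_HEADER.unpack_from(buf, pos)
        if magic != STRUCTURED_REPLY_MAGIC:
            raise ValueError(f"bad magic at offset {pos}")
        start = pos + REPLY_HEADER.size
        if filled - start < length:
            break  # payload not fully received yet
        # Slicing a memoryview references the buffer; it does not copy.
        packets.append((handle, view[start:start + length]))
        pos = start + length
    return packets, pos
```

The caller would move any unconsumed tail (`buf[consumed:filled]`) to the front of the buffer before queuing the next read.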

Further optimization requires changing the NBD protocol a bit.

Current protocol:
1. Memory representation of a response (20-byte header + data)

   HHHHH_DDDDDDDDD...

2. Memory representation of a request (28-byte header + data)

   HHHHHHH_DDDDDDDDD...

Each H and D represents 4 bytes; _ only marks the boundary between
header and data and takes up 0 bytes.

With the large buffer approach, we read data into a large buffer, then
copy the NBD packet's data to a new buffer, strap a new header to it
and send it.
This copying is what we wanted to avoid in the first place.

If the response header were 28 bytes, or the first 8 bytes of data were
unused, we could simply overwrite the header in place and send the data
directly from the large buffer, thereby avoiding the copy.
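
A rough Python sketch of that in-place rewrite, under the hypothetical protocol change where every response carries 8 unused bytes between its 20-byte header and its data. The reply layout is assumed to match the NBD structured reply chunk header, and `rewrite_in_place` is a made-up name for illustration.

```python
import struct

REPLY_HEADER = struct.Struct(">IHHQI")     # 20 bytes (assumed layout)
# 28 bytes: magic, flags, type, handle, offset, length
REQUEST_HEADER = struct.Struct(">IHHQQI")
NBD_REQUEST_MAGIC = 0x25609513
NBD_CMD_WRITE = 1
# The 8 bytes that the proposed change would leave unused.
PAD = REQUEST_HEADER.size - REPLY_HEADER.size


def rewrite_in_place(buf, pos, handle, offset):
    """Turn [20-byte header][8 unused bytes][data] at buf[pos:] into
    [28-byte request header][data] without moving the data.

    Assumes the length field of the received header counts only the data,
    not the 8 unused bytes. Returns a zero-copy view that is ready to send.
    """
    data_len = REPLY_HEADER.unpack_from(buf, pos)[4]
    # The request header covers exactly the old header plus the unused bytes.
    REQUEST_HEADER.pack_into(
        buf, pos, NBD_REQUEST_MAGIC, 0, NBD_CMD_WRITE, handle, offset, data_len
    )
    return memoryview(buf)[pos:pos + REQUEST_HEADER.size + data_len]
```

Without those 8 spare bytes, packing the 28-byte header at `pos` would clobber the first 8 bytes of data, which is why the current layout forces a copy.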

What are your thoughts?

Thanks and Regards.
Abhay
