On Tue, Aug 30, 2022 at 09:04:07AM +0100, Richard W.M. Jones wrote:
> On Tue, Aug 30, 2022 at 10:32:02AM +0800, Ming Lei wrote:
> > Hi Jones,
> >
> > On Thu, Aug 25, 2022 at 01:10:55PM +0100, Richard W.M. Jones wrote:
> > > This patch adds simple support for a ublk-based NBD client.
> > > It is also available here:
> > >
> > > https://gitlab.com/rwmjones/libnbd/-/tree/nbdublk/ublk
> > >
> > > ublk is a way to write Linux block device drivers in userspace:
> >
> > I just looked at your nbdublk implementation a bit; basically it is
> > good, and really nice work.
> >
> > Two suggestions follow:
> >
> > 1) The io_uring context is multiplexed with ublk io command handling,
> > so we should avoid blocking in both ->handle_io_async() and
> > ->handle_event(), otherwise performance may suffer.
>
> The nbd_aio_* calls don't block.
>
> However I noticed that I made a mistake with the flush, trim and zero
> paths, because I am using the synchronous (blocking) nbd_flush /
> nbd_trim / nbd_zero instead of the asynchronous nbd_aio_flush /
> nbd_aio_trim / nbd_aio_zero.  I will fix this soon.
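
That conversion is mechanical.  A minimal sketch of the trim case,
assuming a hypothetical per-request struct and completion helper
(flush and zero are analogous):

  /* Completion callback: runs when the server replies to the trim. */
  static int
  trim_completed (void *user_data, int *error)
  {
    struct per_io_state *io = user_data;    /* illustrative type */

    complete_ublk_request (io, *error);     /* illustrative helper */
    return 1;            /* returning 1 auto-retires the command */
  }

  ...
  int64_t cookie = nbd_aio_trim (nbd, count, offset,
                                 (nbd_completion_callback) {
                                   .callback = trim_completed,
                                   .user_data = io,
                                 },
                                 0);
  if (cookie == -1) {
    /* Submission failed; fail the ublk request with nbd_get_errno (). */
  }
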
> Nothing in handle_event should block except for the call to
> pthread_mutex_lock.  This lock is necessary because new commands can
> be retired on the nbd_work_thread while handle_event is being called
> from the io_uring thread.

Yes, I meant the mutex.

Also, given that nbd_work_thread() is required anyway, it is reasonable
to move the nbd_aio_* calls into nbd_work_thread(), to avoid waiting on
the mutex in the io_uring context; if that context blocks, io commands
may not be forwarded to userspace in time.

BTW, in which context are the callbacks of the nbd_aio_*() calls run?

> > 2) In the implementation of the nbd worker thread there are two
> > sleep points (waiting for incoming io commands, and waiting on the
> > network FD); I'd suggest using poll to wait on both of them at once.
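
The worker can then sleep in a single place.  A sketch of such a loop
using libnbd's poll integration ('evfd' is assumed to be an eventfd
that the io_uring context writes to when it queues new commands; error
handling is omitted):

  /* Worker loop: one poll() covers both wakeup sources. */
  for (;;) {
    struct pollfd fds[2] = {
      { .fd = evfd,                  .events = POLLIN },
      { .fd = nbd_aio_get_fd (nbd),  .events = 0 },
    };
    unsigned dir = nbd_aio_get_direction (nbd);

    if (dir & LIBNBD_AIO_DIRECTION_READ)  fds[1].events |= POLLIN;
    if (dir & LIBNBD_AIO_DIRECTION_WRITE) fds[1].events |= POLLOUT;

    if (poll (fds, 2, -1) == -1)
      break;

    if (fds[0].revents & POLLIN) {
      uint64_t cnt;
      read (evfd, &cnt, sizeof cnt);  /* reset the eventfd counter */
      /* ... dequeue new io commands and start nbd_aio_* requests ... */
    }
    if (fds[1].revents & POLLIN)
      nbd_aio_notify_read (nbd);
    if (fds[1].revents & POLLOUT)
      nbd_aio_notify_write (nbd);
  }
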
> > Recently I have been working on ublksrv io offloading / aio
> > interfaces for this sort of case, in which io_uring can't be used;
> > they may simplify this area.  Please see the attached patch, which
> > applies the above two points on top of your patch.  An obvious
> > improvement can be observed in my simple fio test (randread, io, 4k
> > bs, libaio) against a backend of 'nbdkit file'.
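
For reference, the test was run with a command along these lines (the
device path and iodepth shown here are assumptions):

  fio --name=randread --filename=/dev/ublkb0 --rw=randread --bs=4k \
      --ioengine=libaio --iodepth=64 --direct=1 --runtime=30 --time_based
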
> > But these interfaces aren't merged into the ublksrv github tree yet;
> > you can find them in the aio branch, and demo_event.c is one example
> > of how to use them:
> >
> > https://github.com/ming1/ubdsrv/tree/aio
> >
> > Actually this interface can be improved further for the nbdublk case,
> > and the request allocation isn't actually needed for this direct
> > offloading.  It was added to cover IOs which don't come from the ublk
> > driver, such as metadata IO, which is why 'struct ublksrv_aio' is
> > allocated.  I will try my best to finalize the interfaces and merge
> > them to the master branch.
>
> I didn't really understand what these patches to ubdsrv do when I
> looked at them before.  Maybe add some diagrams?

The patches were added recently.  They are just for simplifying the
offloading of IO handling to another worker context, such as
nbd_work_thread.

They use one eventfd to notify the target worker thread when new
requests are added to a list.  Once the worker thread is woken up, it
fetches requests from the list; after handling those requests, the
worker thread waits on both the eventfd and the target IO (network FD)
via poll().

In your previous implementation, the nbd work thread may wait either on
a pthread mutex or in aio_poll(); that may not be efficient, because
while it is waiting on one event it can't handle the other.
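
A self-contained sketch of that handoff pattern, complementing the
worker loop above (the names here are illustrative only, not the
actual ublksrv aio interface):

  #include <stdint.h>
  #include <unistd.h>
  #include <pthread.h>
  #include <sys/eventfd.h>

  /* Illustrative request node; stands in for 'struct ublksrv_aio'. */
  struct req {
    struct req *next;
    /* ... tag, opcode, buffer, ... */
  };

  static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
  static struct req *pending;    /* requests queued for the worker */
  static int evfd;               /* from eventfd (0, EFD_NONBLOCK) */

  /* io_uring context: push the request and kick the worker. */
  void queue_req (struct req *r)
  {
    uint64_t one = 1;

    pthread_mutex_lock (&lock);
    r->next = pending;
    pending = r;
    pthread_mutex_unlock (&lock);

    write (evfd, &one, sizeof one);  /* wake the worker's poll() */
  }

  /* Worker context, after poll() reports the eventfd readable:
   * take the whole list in one go. */
  struct req *drain_reqs (void)
  {
    struct req *list;
    uint64_t cnt;

    read (evfd, &cnt, sizeof cnt);   /* reset the counter */

    pthread_mutex_lock (&lock);
    list = pending;
    pending = NULL;
    pthread_mutex_unlock (&lock);
    return list;  /* LIFO order; reverse before dispatch if needed */
  }

The mutex is still there, but it is held only for the list push /
splice, so neither context can block on it for long; all the long
waits happen in the worker's poll().
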
> > BTW, IOPS on nbdublk (backend: nbdkit file) still has a big gap
> > compared with ublk-loop, so I guess in future maybe io_uring should
> > be tried, to see if a big improvement can be observed.
>
> It's always going to be a bit slower because we're converting the
> requests into a network protocol and passing them to another process.

In my simple fio randread test it can be 5x ~ 10x slower compared with
ublk-loop.

Thanks,
Ming