128 threads is out of the question for us, primarily due to memory concerns when running multiple disks off one server and because of the nature of our ops (high latency, low CPU). Our plugin forwards reads/writes as requests to cloud storage and uses asio to service our socket ops. Our plugin ops are primarily bound by network throughput or op latency, not by CPU. With 128 threads per disk, most of them would sit idle, and when we run 8 disks on a server we don't want roughly 1000 threads around just to have 1000 cloud reads going in parallel when 16 threads would be more than enough for all 8 disks.

I cloned nbdkit from upstream yesterday and am porting the critical changes to the latest nbdkit code; I will post a patch when something is workable. This may take a few weeks, as I can only really do this work on nights and weekends right now, but we do want to contribute this upstream if possible so that others can use a similar async approach to plugin requests rather than the many-threads approach.

The main changes I'm trying to make are:

1) break the recv_request_send_reply function into two distinct steps (recv_request and send_reply)

2) create an alternative plugin struct (lowlevel? I'm open to name suggestions; FUSE uses fuse_lowlevel.h) that requires the plugin functions to call the reply helper funcs themselves (the nbdkit internals don't send replies for this kind of plugin unless an error is returned when starting the op)

3) when the plugin is not a lowlevel plugin, wrap the existing plugins.c function calls so that they call the new send_reply func, since recv_request no longer calls it unless an error is hit before reaching the plugin level

4) document new lowlevel plugin requirements
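
As a rough illustration of how items 1 to 3 could fit together, here is a minimal C sketch. Every name in it other than recv_request and send_reply is hypothetical, the socket is replaced by an in-memory array, and request parsing is omitted. None of it is actual nbdkit code or the planned patch; it only shows the control flow: the classic path replies from the wrapper, while the lowlevel path replies later from the plugin itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical types for illustration; not taken from nbdkit. */
struct request {
  uint64_t handle;   /* opaque id echoed back in the reply */
  uint32_t count;    /* bytes to read */
  uint64_t offset;   /* offset on the disk */
  char buf[64];      /* data buffer owned by the request */
};

struct reply { uint64_t handle; int error; };

/* Stand-in for the client socket: records every reply "sent". */
static struct reply sent[8];
static int nr_sent;

/* Second half of the split: send the reply for a finished request. */
static void send_reply(const struct request *req, int error)
{
  if (nr_sent < 8) {
    sent[nr_sent].handle = req->handle;
    sent[nr_sent].error = error;
    nr_sent++;
  }
}

/* Classic callback: blocks until done, returns 0 or -1. */
typedef int (*classic_pread_fn)(void *buf, uint32_t count, uint64_t offset);
/* Lowlevel callback: returns 0 once the op is started (the plugin must
 * call send_reply itself later) or -1 if the op could not be started. */
typedef int (*lowlevel_pread_fn)(struct request *req);

struct plugin {
  int is_lowlevel;
  classic_pread_fn classic_pread;
  lowlevel_pread_fn lowlevel_pread;
};

/* First half of the split: validate and dispatch a request that has
 * already been read off the wire (parsing omitted here). */
static void recv_request(const struct plugin *p, struct request *req)
{
  if (req->count > sizeof req->buf) {   /* error before the plugin level */
    send_reply(req, EINVAL);
    return;
  }
  if (p->is_lowlevel) {
    if (p->lowlevel_pread(req) == -1)   /* error starting the op */
      send_reply(req, EIO);
    /* otherwise the plugin replies when its async op completes */
  } else {
    /* wrapper for classic plugins: run the blocking call, then reply */
    int r = p->classic_pread(req->buf, req->count, req->offset);
    send_reply(req, r == -1 ? EIO : 0);
  }
}

/* Example classic plugin: returns zeroes synchronously. */
static int zero_pread(void *buf, uint32_t count, uint64_t offset)
{
  (void) offset;
  memset(buf, 0, count);
  return 0;
}

/* Example lowlevel plugin: remembers the request and completes later. */
static struct request *pending;

static int async_pread(struct request *req)
{
  pending = req;
  return 0;
}

/* Would be driven by an event loop when the cloud read finishes. */
static void async_complete(void)
{
  memset(pending->buf, 0xAB, pending->count);
  send_reply(pending, 0);
  pending = NULL;
}
```

With this shape, a classic plugin produces its reply before recv_request returns, while a lowlevel plugin produces none until async_complete runs, so a single thread can keep many requests in flight.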

- Shaun

On Sun, Jan 21, 2018 at 5:16 PM, Richard W.M. Jones <rjones@redhat.com> wrote:

On Fri, Jan 19, 2018 at 12:41:01PM -0500, Shaun McDowell wrote:
[...]

Thanks for the other email, I thought it was very interesting and I'm
glad that people are looking at the performance of nbdkit seriously.

> Our cbdkit was branched from 1.1.12 with some patches from 1.1.13 added as
> well. The project has morphed significantly enough that a direct diff or
> merging between the two would not be feasible. Even the structure of the
> project directory and build has been changed to be in line with our other
> internal projects.
>
> I have uploaded the entire cbdkit source to our github at
> https://github.com/dev-cloudbd/cbdkit
[...]

As you say the structure is quite a lot different, making it difficult
for me to use any of this work at the moment. I do have a couple of
questions though ...

- Is a multithreaded approach (as Eric has now implemented) completely
out of the question? I'm guessing the problem will be with memory
usage for all of those thread stacks. You mentioned memory usage of
100MB and maybe that's important for your cloud use case?

- Are you going to try to pull any changes from upstream nbdkit or is
the fork now too large to try?

I think if you wanted to get any of this upstream [it wasn't really
clear if you do, and of course licensing-wise it's entirely optional
for you to release any changes at all], but if you did then maybe see
if there are simple patches that could go first.

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW