128 threads is out of the question for us, primarily due to memory concerns when running multiple disks off one server and because of the nature of our ops (high latency, low CPU). Our plugin forwards reads/writes as requests to cloud storage and uses asio to service our socket ops. Our plugin ops are primarily network-throughput bound or op-latency bound, not CPU bound. With 128 threads per disk, most of them would sit idle, and when we run 8 disks on a server we don't want roughly 1000 threads around just to have 1000 cloud reads going in parallel when 16 threads would be more than enough for all 8 disks.
I cloned nbdkit from upstream yesterday and am porting the critical changes to the latest nbdkit code; I will post a patch when something is workable. This may take me a few weeks, as I am only really able to do this work on nights and weekends right now, but we do want to contribute this upstream if possible so others can use a similar async approach to plugin requests rather than the many-threads approach.
The main changes I'm trying to make are:
1) break the recv_request_send_reply function into two distinct steps (recv_request and send_reply)
2) create an alternative plugin struct (lowlevel? name open to suggestions; fuse uses fuse_lowlevel.h) whose plugin functions must call the reply helper funcs themselves (the nbdkit internals don't send replies for this kind of plugin unless an error is returned when starting the op)
3) for plugins that are not lowlevel, wrap the existing plugins.c function calls so that they call the new send_reply func, since recv_request no longer sends a reply unless an error is hit before the plugin level
4) document new lowlevel plugin requirements
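
To make the shape of changes 1-3 concrete, here is a rough sketch in C. It is not the actual patch and none of it is existing nbdkit API: struct request, classic_plugin, lowlevel_plugin, the dispatch_* wrappers and the toy plugins are all hypothetical names made up for illustration, and send_reply only records the result instead of writing to the socket. The single-slot "pending" pointer stands in for whatever queue an asio-style event loop would keep.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names throughout; none of this is existing nbdkit API. */
struct request {
  uint64_t handle;    /* NBD handle, echoed back in the reply */
  uint64_t offset;
  uint32_t count;
  void *buf;
  int replied;        /* bookkeeping for this sketch only */
  int error;
};

/* Change 1: the reply half of recv_request_send_reply, callable on its
 * own.  A real version would write the NBD reply to the socket. */
static void
send_reply (struct request *req, int error)
{
  assert (!req->replied);       /* each request gets exactly one reply */
  req->error = error;
  req->replied = 1;
}

/* Existing style: synchronous, returns 0 or -1. */
struct classic_plugin {
  int (*pread) (void *buf, uint32_t count, uint64_t offset);
};

/* Change 2: returns 0 if the op was started (plugin replies later),
 * -1 if it could not be started. */
struct lowlevel_plugin {
  int (*pread) (struct request *req);
};

/* Change 3: wrapper that replies on behalf of a classic plugin. */
static void
dispatch_classic (const struct classic_plugin *p, struct request *req)
{
  int r = p->pread (req->buf, req->count, req->offset);
  send_reply (req, r == -1 ? EIO : 0);
}

/* Lowlevel path: the core replies only if starting the op failed. */
static void
dispatch_lowlevel (const struct lowlevel_plugin *p, struct request *req)
{
  if (p->pread (req) == -1)
    send_reply (req, EIO);
}

/* Toy classic plugin: reads as zeroes. */
static int
zero_pread (void *buf, uint32_t count, uint64_t offset)
{
  (void) offset;
  memset (buf, 0, count);
  return 0;
}

/* Toy lowlevel plugin with a single in-flight slot. */
static struct request *pending;

static int
async_pread (struct request *req)
{
  if (pending != NULL)
    return -1;                  /* busy: could not start the op */
  pending = req;
  return 0;
}

/* Stand-in for the completion handler an event loop would invoke. */
static void
async_complete (void)
{
  struct request *req = pending;
  if (req == NULL)
    return;
  pending = NULL;
  memset (req->buf, 0xff, req->count);
  send_reply (req, 0);
}
```

In this sketch a request handed to dispatch_lowlevel stays unanswered until async_complete runs, so the receiving thread is free to pick up the next request in the meantime. That is how a handful of threads could keep many cloud reads in flight.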
- Shaun