On Tue, Feb 15, 2022 at 5:54 PM Richard W.M. Jones <rjones@redhat.com> wrote:
Pick the nbdcopy --requests parameter to target an implicit buffer
size of 64M inside nbdcopy.  However don't set nbdcopy --requests < 64.

If request_size == 256K (the default) => requests = 256
If request_size == 8M => requests = 64 (buffer size 512M)

Considering the total bytes buffered makes sense. I did the same in another
application that only reads from NBD using the libnbd async API. I'm using:

max_requests = 16
max_bytes = 2m

So if you have small requests (e.g. 4k), you get 16 inflight requests per
connection, and with 4 connections, 64 inflight requests on the storage side.

But if you have large requests (256k), the 2m byte limit allows only 8 requests
per connection, so you get 32 requests on the storage side.
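
The limit described above can be sketched roughly like this. It is only an
illustration, not the code from that application: the function name
inflight_per_connection is made up, and it assumes the byte limit simply caps
the request count at max_bytes / request_size (but never below 1).

```ocaml
(* Hypothetical sketch: cap in-flight requests per connection by both a
   request count and a total byte budget. *)
let max_requests = 16
let max_bytes = 2 * 1024 * 1024

let inflight_per_connection request_size =
  (* The byte budget allows this many requests; always allow at least one. *)
  let by_bytes = max 1 (max_bytes / request_size) in
  min max_requests by_bytes

let () =
  let connections = 4 in
  List.iter
    (fun size ->
      let n = inflight_per_connection size in
      Printf.printf "request_size=%dk: %d per connection, %d on storage\n"
        (size / 1024) n (n * connections))
    [ 4 * 1024; 256 * 1024 ]
```

With these values, 4k requests are limited by max_requests and 256k requests
are limited by max_bytes.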

This was tested in a read-only case both on my laptop with fast NVMe
(Samsung 970 EVO Plus 1T) and with super fast NVMe on a Dell server,
and with shared storage (NetApp iSCSI).

With fast NVMe, limiting the maximum buffered bytes to 1M is actually
~10% faster, but with shared storage using more requests is faster.

What you suggest here will result in:
small requests: 256 requests per connection, 1024 requests on storage side
large requests: 64 requests per connection, 256 requests on storage side.
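
For reference, the arithmetic behind these numbers can be sketched as below.
This assumes 4 connections and the rule as stated in the commit message
(target a 64M buffer, but never use fewer than 64 requests); the function name
proposed_requests is only illustrative.

```ocaml
(* Hypothetical sketch of the rule described in the commit message. *)
let proposed_requests request_size =
  let target_buffer_size = 64 * 1024 * 1024 in
  (* Never go below 64 requests, even for large request sizes. *)
  max 64 (target_buffer_size / request_size)

let () =
  let connections = 4 in
  List.iter
    (fun (label, size) ->
      let n = proposed_requests size in
      Printf.printf
        "%s: %d per connection, %d on storage, %dM buffered per connection\n"
        label n (n * connections) (n * size / (1024 * 1024)))
    [ ("256K", 256 * 1024); ("8M", 8 * 1024 * 1024) ]
```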

I don't think any storage performs better with such a large number of inflight
requests.

I think we should test --requests 8 first; it may show a nice speedup compared
to what we see in 

Looks like in 

We introduced 2 changes at the same time, which makes it impossible to tell
the effect of any single change.

 v2v/v2v.ml | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/v2v/v2v.ml b/v2v/v2v.ml
index cadf864d5c..7bd47c1e7e 100644
--- a/v2v/v2v.ml
+++ b/v2v/v2v.ml
@@ -641,14 +641,27 @@ and nbdcopy ?request_size output_alloc input_uri output_uri =
   let cmd = ref [] in
   List.push_back_list cmd [ "nbdcopy"; input_uri; output_uri ];
   (match request_size with
     | None -> ()
     | Some size -> List.push_back cmd (sprintf "--request-size=%d" size)
   );
+  (* Choose max requests to target an implicit buffer size of 64M. *)
+  let requests =
+    let target_buffer_size = 64 * 1024 * 1024 in
+    let request_size =
+      match request_size with
+      | None -> 256 * 1024 (* default in nbdcopy 1.10+ *)
+      | Some size -> size in
+    max 64 (target_buffer_size / request_size) in
+  List.push_back cmd (sprintf "--requests=%d" requests);
   List.push_back cmd "--flush";
   (*List.push_back cmd "--verbose";*)
   if not (quiet ()) then List.push_back cmd "--progress";
   if output_alloc = Types.Preallocated then List.push_back cmd "--allocated";
   let cmd = !cmd in

   if run_command cmd <> 0 then