On Thu, Jul 07, 2022 at 02:35:41PM -0500, Eric Blake wrote:
 > +++ b/copy/copy-file-to-qcow2-compressed.sh
 
 > +
 > +# Create a compressed qcow2 file1.
 > +#
 > +# sparse-random files should compress easily because by default each
 > +# block uses repeated bytes.
 > +qemu-img create -f qcow2 $file1 $size
 > +nbdcopy -- [ nbdkit --exit-with-parent sparse-random $size seed=$seed ] \
 > +        [ $QEMU_NBD --image-opts
driver=compress,file.driver=qcow2,file.file.driver=file,file.file.filename=$file1 ]
 
 Long line, but not much you can do about it, short of adding in more
 use of temporary shell variables.  If you think it adds legibility,
 this could be something like:
 
 opts=driver=compress
 opts+=,file.driver=qcow2
 opts+=,file.file.driver=file
 opts+=,file.file.filename=$file1
 nbdcopy -- [ ... ] [ $QEMU_NBD --image-opts "$opts" ] 
Thanks - updated as suggested in 05ef84e06357.
 > +++ b/copy/main.c
 > @@ -40,6 +40,7 @@
 >  
 >  #include "ispowerof2.h"
 >  #include "human-size.h"
 > +#include "minmax.h"
 >  #include "version.h"
 >  #include "nbdcopy.h"
 >  
 > @@ -379,10 +380,22 @@ main (int argc, char *argv[])
 >    if (threads < connections)
 >      connections = threads;
 >  
 > +  /* request_size must always be at least as large as the preferred
 > +   * size of source & destination.
 > +   */
 > +  request_size = MAX (request_size, src->preferred);
 > +  request_size = MAX (request_size, dst->preferred);
 > +
 >    /* Adapt queue to size to request size if needed. */
 >    if (request_size > queue_size)
 >      queue_size = request_size;
 >  
 > +  /* Sparse size (if using) must not be smaller than the destination
 > +   * preferred size, otherwise we end up creating too small requests.
 > +   */
 > +  if (sparse_size > 0 && sparse_size < dst->preferred)
 > +    sparse_size = dst->preferred;
 > +
 
 Do we need to check anywhere that a command-line --request-size is a
 power of 2, and does not exceed the size of how much extents map we
 are willing to collect in one go? 
We initially check request_size is a power of 2 when parsing it from
the command line, but it (was) possible that preferred block size is
not, in which case the above statements could change it so it's not a
power of 2.
I've added commit 06222abbd34f upstream which forces the preferred
values to always be powers of 2 and adds more checks.  Hopefully now
both request_size and sparse_size should always be powers of 2.
 > +        /* If we were in the middle of deferred zeroing, do it
now. */
 > +        if (is_zeroing) {
 > +          /* Note that offset-zeroing_start can never exceed
 > +           * THREAD_WORK_SIZE, so there is no danger of overflowing
 > +           * size_t.
 
 Here's one reason why a runtime check that command-line --request-size
 doesn't exceed THREAD_WORK_SIZE might be worthwhile (that would be in
 main, not here, though). 
Added in commit 4dd69b44291.
 > +           */
 > +          command = create_command (zeroing_start, offset-zeroing_start,
 > +                                    true, w);
 > +          fill_dst_range_with_zeroes (command);
 > +          is_zeroing = false;
 >          }
 > +
 > +        /* Issue the asynchronous read command. */
 > +        command = create_command (offset, len, false, w);
 > +
 > +        wait_for_request_slots (w);
 > +
 > +        /* NOTE: Must increase the queue size after waiting. */
 > +        increase_queue_size (w, len);
 > +
 > +        /* Begin the asynch read operation. */
 > +        src->ops->asynch_read (src, command,
 > +                               (nbd_completion_callback) {
 > +                                 .callback = finished_read,
 > +                                 .user_data = command,
 > +                               });
 
 I'm not sure if asynch_read() can be made more efficient when the
 source has a smaller granularity than --request-size or the
 destination, so that we aren't reading zeroes for the subset of the
 buffer.  But that is performance, not correctness (it is always
 correct to read zeroes); the question is whether it is wasted effort,
 and since we know there is data at least somewhere in the window of
 the read, this may be in the noise.  I'm fine with the current
 implementation; we may have few enough mixed-source clusters that we
 could actually penalize ourselves with the overhead of trying to avoid
 sub-region reads. 
It is definitely wasted effort.  I punted on an implementation which
tries to follow the extent map for the source when the source
granularity is smaller than the destination (just for sanity!)
All the internal tests pass, plus nbdkit tests pass when using
nbdcopy @4dd69b44291.
Thanks,
Rich.
-- 
Richard Jones, Virtualization Group, Red Hat 
http://people.redhat.com/~rjones
Read my programming and virtualization blog: 
http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  
http://libguestfs.org