On Wed, May 26, 2021 at 02:50:32PM +0300, Nir Soffer wrote:
Basically all give very similar results.
# hyperfine "./copy-libev $SRC $DST" "qemu-img convert -n -W -m 16 -S
1048576 $SRC $DST" "../copy/nbdcopy --sparse=1048576
--request-size=1048576 --flush --requests=16 --connections=1 $SRC
$DST"
Benchmark #1: ./copy-libev nbd+unix:///?socket=/tmp/src.sock
nbd+unix:///?socket=/tmp/dst.sock
Time (mean ± σ): 103.514 s ± 0.836 s [User: 7.153 s, System: 19.422 s]
Range (min … max): 102.265 s … 104.824 s 10 runs
Benchmark #2: qemu-img convert -n -W -m 16 -S 1048576
nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock
Time (mean ± σ): 103.104 s ± 0.899 s [User: 2.897 s, System: 25.204 s]
Range (min … max): 101.958 s … 104.499 s 10 runs
Benchmark #3: ../copy/nbdcopy --sparse=1048576 --request-size=1048576
--flush --requests=16 --connections=1
nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock
Time (mean ± σ): 104.085 s ± 0.977 s [User: 7.188 s, System: 19.965 s]
Range (min … max): 102.314 s … 105.153 s 10 runs
In my testing (results below), nbdcopy is a clear 4x faster than
qemu-img convert, and 4 also happens to be the default number of
connections/threads in nbdcopy.

Why use nbdcopy --connections=1? That completely disables threads in
nbdcopy. Also I'm not sure whether --flush is a fair comparison (it
depends on what qemu-img convert does at the end of the copy, which I
don't know).
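It might be worth re-running that comparison with nbdcopy's defaults,
i.e. letting it use its default 4 connections and dropping --flush, so
the two tools are compared on equal terms. A minimal, untested sketch,
reusing the same $SRC/$DST sockets as above:

$ hyperfine \
    "qemu-img convert -n -W -m 16 -S 1048576 $SRC $DST" \
    "../copy/nbdcopy --sparse=1048576 --request-size=1048576 --requests=16 $SRC $DST"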
The other interesting things are the qemu-img convert flags you're using:

  -m 16  number of coroutines, default is 8

  -W     out-of-order writes, but the manual says "This is only
         recommended for preallocated devices like host devices or other
         raw block devices", which is a very unclear recommendation to me.
         What's special about host devices versus (e.g.) files or qcow2
         files that means -W wouldn't always be recommended?
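If it would help to see which of those two flags is actually doing the
work, they could be benchmarked separately against the same
sparse-random harness I use below; a rough, untested sketch:

$ hyperfine \
    'nbdkit -U - sparse-random size=100G --run "qemu-img convert -W \$uri \$uri"' \
    'nbdkit -U - sparse-random size=100G --run "qemu-img convert -m 16 \$uri \$uri"'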
Anyway I tried various settings to see if I could improve the
performance of qemu-img convert vs nbdcopy using the sparse-random
test harness. The results seem to confirm what has been said in this
thread so far.
libnbd-1.7.11-1.fc35.x86_64
nbdkit-1.25.8-2.fc35.x86_64
qemu-img-6.0.0-1.fc35.x86_64
$ hyperfine \
    'nbdkit -U - sparse-random size=100G --run "qemu-img convert \$uri \$uri"' \
    'nbdkit -U - sparse-random size=100G --run "qemu-img convert -m 16 -W \$uri \$uri"' \
    'nbdkit -U - sparse-random size=100G --run "nbdcopy \$uri \$uri"' \
    'nbdkit -U - sparse-random size=100G --run "nbdcopy --request-size=1048576 --requests=16 \$uri \$uri"'

Benchmark #1: nbdkit -U - sparse-random size=100G --run "qemu-img convert \$uri \$uri"
  Time (mean ± σ):     17.245 s ±  1.004 s    [User: 28.611 s, System: 7.219 s]
  Range (min … max):   15.711 s … 18.930 s    10 runs

Benchmark #2: nbdkit -U - sparse-random size=100G --run "qemu-img convert -m 16 -W \$uri \$uri"
  Time (mean ± σ):      8.618 s ±  0.266 s    [User: 33.091 s, System: 7.331 s]
  Range (min … max):    8.130 s …  8.943 s    10 runs

Benchmark #3: nbdkit -U - sparse-random size=100G --run "nbdcopy \$uri \$uri"
  Time (mean ± σ):      5.227 s ±  0.153 s    [User: 34.299 s, System: 30.136 s]
  Range (min … max):    5.049 s …  5.439 s    10 runs

Benchmark #4: nbdkit -U - sparse-random size=100G --run "nbdcopy --request-size=1048576 --requests=16 \$uri \$uri"
  Time (mean ± σ):      4.198 s ±  0.197 s    [User: 32.105 s, System: 24.562 s]
  Range (min … max):    3.868 s …  4.474 s    10 runs
Summary
  'nbdkit -U - sparse-random size=100G --run "nbdcopy --request-size=1048576 --requests=16 \$uri \$uri"' ran
    1.25 ± 0.07 times faster than 'nbdkit -U - sparse-random size=100G --run "nbdcopy \$uri \$uri"'
    2.05 ± 0.12 times faster than 'nbdkit -U - sparse-random size=100G --run "qemu-img convert -m 16 -W \$uri \$uri"'
    4.11 ± 0.31 times faster than 'nbdkit -U - sparse-random size=100G --run "qemu-img convert \$uri \$uri"'
## Compare nbdcopy request size with 16 requests and one connection
This is testing 4 connections I think? Or is the destination not
advertising multi-conn?
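One way to remove the ambiguity would be to pin the connection count
explicitly on both sides of the comparison; a sketch, reusing the flags
from your command above (noting that if the destination does not
advertise multi-conn, nbdcopy limits itself to one connection anyway,
which is your next point below):

$ hyperfine \
    "../copy/nbdcopy --request-size=1048576 --requests=16 --connections=1 $SRC $DST" \
    "../copy/nbdcopy --request-size=1048576 --requests=16 --connections=4 $SRC $DST"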
## Compare number of requests with multiple connections
To enable multiple connections to the destination, I hacked nbdcopy
to ignore the destination's can_multi_conn and always use multiple
connections. This is how we use qemu-nbd with imageio in RHV.
So qemu-nbd doesn't advertise multi-conn? I'd prefer if we fixed qemu-nbd.
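For what it's worth, that is easy to check directly: serve a scratch
image with qemu-nbd and ask nbdinfo what the writable export advertises
(the can_multi_conn line in nbdinfo's output is the flag in question).
A sketch with a hypothetical /tmp/scratch.img and socket path:

$ qemu-img create -f raw /tmp/scratch.img 1G
$ qemu-nbd --persistent --format=raw --socket=/tmp/scratch.sock /tmp/scratch.img &
$ nbdinfo "nbd+unix:///?socket=/tmp/scratch.sock" | grep multi_conn

If that reports false for a writable export, that would explain why the
hack above is needed.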
Anyway, interesting stuff, thanks.
Rich.
--
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones
Read my programming and virtualization blog:
http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html