On Wed, May 26, 2021 at 10:32:08AM +0100, Richard W.M. Jones wrote:
On Wed, May 26, 2021 at 11:40:11AM +0300, Nir Soffer wrote:
> On Tue, May 25, 2021 at 9:06 PM Richard W.M. Jones <rjones(a)redhat.com> wrote:
> > I ran perf as below. Although nbdcopy and nbdkit themselves do not
> > require root (and usually should _not_ be run as root), in this case
> > perf must be run as root, so everything has to be run as root.
> >
> > # perf record -a -g --call-graph=dwarf ./nbdkit -U - sparse-random size=1T \
> >     --run "MALLOC_CHECK_= ../libnbd/run nbdcopy \$uri \$uri"
>
> This uses 64 requests with a request size of 32m. In my tests using
> --requests 16 --request-size 1048576 is faster. Did you try to profile
> this?
Interesting! No, I didn't. In fact I just assumed that larger request
sizes and more parallel requests would be better.
This is the topology of the machine I ran the tests on:
https://rwmj.files.wordpress.com/2019/09/screenshot_2019-09-04_11-08-41.png
Even a single 32MB buffer isn't going to fit in any cache, so reducing
buffer size should be a win, and once they are within the size of the
L3 cache, reusing buffers should also be a win.
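To put rough numbers on that, here is a small sketch of the buffer arithmetic. It is only an illustration under assumptions: the L3 size below is a made-up placeholder (the real value is in the topology image above), the helper name is hypothetical, and it treats the request count as a single pool, ignoring how many connections are in use.

```python
MiB = 1024 * 1024

# Hypothetical L3 size, for illustration only; substitute the real value
# from the machine topology.
ASSUMED_L3_BYTES = 16 * MiB


def in_flight_bytes(requests: int, request_size: int) -> int:
    """Upper bound on buffer memory if every request slot holds one buffer."""
    return requests * request_size


# (number of requests, request size in bytes)
configs = {
    "64 requests x 32M": (64, 32 * MiB),
    "64 requests x 1M": (64, 1 * MiB),
    "16 requests x 1M": (16, 1 * MiB),
}

for name, (requests, size) in configs.items():
    total = in_flight_bytes(requests, size)
    print(
        f"{name}: single buffer {size // MiB} MiB, "
        f"all buffers {total // MiB} MiB, "
        f"single buffer fits in assumed L3: {size <= ASSUMED_L3_BYTES}"
    )
```

With 1M requests a single buffer is small enough to be cache-resident, but that only pays off if the same buffers are touched again, which is why reuse matters.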
That's the theory anyway ... Using --request-size=1048576 changes the
flamegraph quite dramatically (see new attachment).
[What do the swapper stack traces mean? Are they coming from idle
cores?]
Test runs slightly faster:
$ hyperfine 'nbdkit -U - sparse-random size=1T --run "nbdcopy \$uri \$uri"'
Benchmark #1: nbdkit -U - sparse-random size=1T --run "nbdcopy \$uri \$uri"
Time (mean ± σ): 47.407 s ± 0.953 s [User: 347.982 s, System: 276.220 s]
Range (min … max): 46.474 s … 49.373 s 10 runs
$ hyperfine 'nbdkit -U - sparse-random size=1T --run "nbdcopy --request-size=1048576 \$uri \$uri"'
Benchmark #1: nbdkit -U - sparse-random size=1T --run "nbdcopy --request-size=1048576 \$uri \$uri"
Time (mean ± σ): 43.796 s ± 0.799 s [User: 328.134 s, System: 252.775 s]
Range (min … max): 42.289 s … 44.917 s 10 runs
(Note the buffers are still not being reused.)
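For anyone who wants the two hyperfine runs side by side, this sketch works out the relative changes from the figures above. The numbers are copied from the output; the helper names are my own and not part of any tool.

```python
# Figures copied from the two hyperfine runs above (seconds).
default_run = {"mean": 47.407, "user": 347.982, "system": 276.220,
               "range": (46.474, 49.373)}
small_req_run = {"mean": 43.796, "user": 328.134, "system": 252.775,
                 "range": (42.289, 44.917)}


def pct_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is lower than `before`."""
    return (before - after) / before * 100.0


def ranges_overlap(a: tuple, b: tuple) -> bool:
    """True if the closed intervals a and b share any point."""
    return a[0] <= b[1] and b[0] <= a[1]


for key in ("mean", "user", "system"):
    change = pct_reduction(default_run[key], small_req_run[key])
    print(f"{key}: {change:.1f}% lower with --request-size=1048576")

print("min/max ranges overlap:",
      ranges_overlap(default_run["range"], small_req_run["range"]))
```

The wall-clock mean drops by a bit under 8%, and the min/max ranges of the two runs do not overlap, so the difference looks real rather than noise, at least on this machine.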
Rich.
--
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones
Read my programming and virtualization blog:
http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines. Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v