FYI: perf commands I'm using to benchmark nbdcopy
by Richard W.M. Jones
Hi Abhay,
FYI I thought I would document the successful commands I am using to
benchmark nbdcopy and produce the flame graphs that you saw this
morning. Attached is a very recent flame graph produced using this
method.
Firstly I'm running everything on Fedora 34, with selected packages
upgraded to Fedora Rawhide. However any reasonably recent Linux
distro should work fine. You will need to install the perf tool.
Compile libnbd & nbdkit from git source, following the instructions in
the respective README files.
https://gitlab.com/nbdkit/libnbd
https://gitlab.com/nbdkit/nbdkit
I have nbdkit and libnbd checked out in adjacent directories. This is
important so that commands like "./nbdkit" and "../libnbd/run nbdcopy"
work. There's more information about this in the READMEs.
I ran perf as below. Although nbdcopy and nbdkit themselves do not
require root (and usually should _not_ be run as root), in this case
perf must be run as root, so everything has to be run as root.
# perf record -a -g --call-graph=dwarf ./nbdkit -U - sparse-random size=1T --run "MALLOC_CHECK_= ../libnbd/run nbdcopy \$uri \$uri"
Some things to explain:
* The output is perf.data in the local directory. This file may be
huge (22GB for me!)
* I am running this from the nbdkit directory, so ./nbdkit runs the
locally compiled copy of nbdkit. This allows me to make quick
changes to nbdkit and see the effects immediately.
* I am running nbdcopy using "../libnbd/run nbdcopy", so that's from
the adjacent locally compiled libnbd directory. Again the reason
for this is so I can make changes, recompile libnbd, and see the
effect quickly.
* "MALLOC_CHECK_=" is needed because of complicated reasons to do
with how the nbdkit wrapper enables malloc-checking. We should
probably provide a way to disable malloc-checking when benchmarking
because it adds overhead for no benefit, but I've not done that yet
(patches welcome!)
* The test harness is nbdkit-sparse-random-plugin, documented here:
https://libguestfs.org/nbdkit-sparse-random-plugin.1.html
* I'm using DWARF debugging info to generate call stacks, which is
more reliable than the default (frame pointers).
* The -a option means I'm measuring events on the whole machine. You
can read the perf manual to find out how to measure only a single
process (eg. just nbdkit or just nbdcopy). But actually measuring
the whole machine gives a truer picture, I believe.
* If the test takes too long to run or runs out of space, try
adjusting the size (1T = 1 terabyte) downwards, eg. 512G, 256G, ...
until it fits. Although nbdkit doesn't store the virtual disk or
use very much memory at all, the test does appear to stress the
Linux VMM, and the amount of perf.data generated can be huge.
Then I run this long command to generate the flame graph. Again
it must be run as root:
# perf script | ../FlameGraph/stackcollapse-perf.pl | ../FlameGraph/flamegraph.pl > nbdcopy.svg
* This reads perf.data as input.
* Brendan Gregg's FlameGraph code is checked out in another adjacent
directory.
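For repeated runs it can help to generate the two command lines from a small script, so that the size is easy to shrink. Below is a minimal Python sketch, assuming the adjacent-directory layout described above. The helper names perf_record_argv and flamegraph_pipeline are hypothetical and are not part of nbdkit, libnbd or perf.

```python
import shlex


def perf_record_argv(size="1T", nbdkit="./nbdkit", libnbd_run="../libnbd/run"):
    """Build the argv for the 'perf record' step (hypothetical helper)."""
    # The whole --run script is a single argument; nbdkit itself expands $uri.
    run_script = f"MALLOC_CHECK_= {libnbd_run} nbdcopy $uri $uri"
    return [
        "perf", "record", "-a", "-g", "--call-graph=dwarf",
        nbdkit, "-U", "-", "sparse-random", f"size={size}",
        "--run", run_script,
    ]


def flamegraph_pipeline(flamegraph_dir="../FlameGraph", output="nbdcopy.svg"):
    """Build the shell pipeline that turns perf.data into an SVG (hypothetical helper)."""
    collapse = shlex.quote(f"{flamegraph_dir}/stackcollapse-perf.pl")
    render = shlex.quote(f"{flamegraph_dir}/flamegraph.pl")
    return f"perf script | {collapse} | {render} > {shlex.quote(output)}"


if __name__ == "__main__":
    # Print candidate commands for progressively smaller virtual disks.
    # shlex.join single-quotes the --run script, so the shell leaves $uri alone.
    for size in ("1T", "512G", "256G"):
        print(shlex.join(perf_record_argv(size)))
    print(flamegraph_pipeline())
```

The script only prints the commands; they still have to be run as root from the nbdkit directory, as explained above.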
You can open the SVG file in a web browser. Try clicking around -
it's interactive.
If you get stuck, ask questions; we're here to help.
Rich.
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines. Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top
3 years, 6 months
[PATCH nbdkit] sparse-random: Don't generate random content in blocks by default
by Richard W.M. Jones
[I already pushed this upstream, this email is FYI]
As discussed earlier today, when testing nbdcopy with
nbdkit-sparse-random-plugin as the test harness, a very large amount
of time was spent generating random numbers to fill the data blocks.
This was pointless make-work, and this patch fixes it. More details in
the commit message.
Rich.
[nbdkit PATCH 1/1] python: cleanup examples
by Michael Ablassmeier
The examples use print() instead of nbdkit.debug() to print debugging
information, which doesn't work. Replace the calls to print() with
nbdkit.debug().
Signed-off-by: Michael Ablassmeier <abi(a)grinser.de>
---
plugins/python/examples/ramdisk.py | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/plugins/python/examples/ramdisk.py b/plugins/python/examples/ramdisk.py
index 2cde6aab..412ad31b 100644
--- a/plugins/python/examples/ramdisk.py
+++ b/plugins/python/examples/ramdisk.py
@@ -43,11 +43,11 @@ API_VERSION = 2
# This just prints the extra command line parameters, but real plugins
# should parse them and reject any unknown parameters.
def config(key, value):
- print("ignored parameter %s=%s" % (key, value))
+ nbdkit.debug("ignored parameter %s=%s" % (key, value))
def open(readonly):
- print("open: readonly=%d" % readonly)
+ nbdkit.debug("open: readonly=%d" % readonly)
# You can return any non-NULL Python object from open, and the
# same object will be passed as the first arg to the other
--
2.30.2
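For reference, here is a minimal sketch of the two callbacks after this change, written so that it can also be imported outside nbdkit. The nbdkit module only exists inside nbdkit's embedded interpreter, so the fallback class below is a stand-in added purely for illustration and is not part of the plugin API. Returning 1 from open() is an arbitrary choice of handle.

```python
import sys

try:
    import nbdkit  # only available when the script is loaded by nbdkit
except ImportError:
    class nbdkit:
        """Stand-in for illustration only: records and echoes debug messages."""
        messages = []

        @staticmethod
        def debug(msg):
            nbdkit.messages.append(msg)
            print("debug: " + msg, file=sys.stderr)

API_VERSION = 2


def config(key, value):
    # Real plugins should parse parameters and reject unknown ones.
    nbdkit.debug("ignored parameter %s=%s" % (key, value))


def open(readonly):
    nbdkit.debug("open: readonly=%d" % readonly)
    # Any non-None object can serve as the per-connection handle.
    return 1
```

When loaded by nbdkit, the debug messages appear only if the server is run in verbose mode (nbdkit -v).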
[PATCH] templates: Add CentOS Stream 8
by Stef Walter
This adds CentOS Stream 8 to the set of images that the builder
can use. CentOS Stream 8 is a continuously delivered upstream of
the next RHEL 8.x release. It doesn't have a minor version number.
It appears that my index-fragment has a hash in it, and thus this
commit may need to be amended by the maintainer before merging
into the main repo.
This has been tested by creating a local centosstream-8.xz image
then consuming that image via a local repos.d file and using
virt-builder to build a bootable CentOS Stream 8 image:
$ cat ~/.config/virt-builder/repos.d/local.conf
[local]
uri=file:///data/src/guestfs-tools/builder/templates/index
$ virt-builder --list | grep stream
centosstream-8 x86_64 CentOS Stream 8
$ virt-builder centosstream-8
[ 1.0] Downloading: file:///data/src/guestfs-tools/build...
[ 1.7] Planning how to build this image
[ 1.7] Uncompressing
[ 7.8] Opening the new disk
[ 12.6] Setting a random seed
[ 12.6] Setting passwords
virt-builder: Setting random password of root to qmjtBFyLX...
[ 13.5] Finishing off
Output file: centosstream-8.img
Output size: 6.0G
Output format: raw
Total usable space: 5.4G
Free space: 3.8G (71%)
--
Stef Walter (he / his)
Linux Engineering
Red Hat
Re: [Libguestfs] guestfish emergency extension
by Richard W.M. Jones
On Tue, May 18, 2021 at 09:36:12AM +0200, Gottschalk wrote:
> Hello Mr. Jones,
>
> we use guestfish 1.4
This version was released 11 years ago??
> We have a data recovery emergency. We restored an ESXi 6.7 VMware image and
> wanted to read the data inside with guestfish.
>
> The structure of the vm is based on LVM.
>
> The first disk with the os inside a LV can be mounted correctly.
>
> Two disks of 400GB each are combined in a volume group. There is one logical
> volume of 800GB in this VG.
>
> Guestfish does not read the filesystem in this LV.
>
> I tried
>
> ><fs>add disk1.vmdk
>
> ><fs>add disk2.vmdk
>
> ><fs>add disk3.vmdk
>
> ><fs>run
>
> ><fs>list-devices
> /dev/sda
> /dev/sdb
> /dev/sdc
>
> ><fs>list-partitions
> /dev/sda1
> On sdb and sdc there are no partitions. The LV resides directly on these disks.
> I simulated this with partitioned disks and identically result.
>
> behavior without partitions on sdb and sdc
> ><fs> list-filesystems
> /dev/mailxxx-vg/root: ext4
> /dev/mailxxx-vg/swap_1: swap
> /dev/vmail-vg/vmail-lv: unknown
>
> behavior with partitions on sdb and sdc
> ><fs> list-filesystems
> libguestfs: error: list_filesystems: parted exited with status 1: Error: Can't
> have a partition outside the disk!
> ><fs>
>
>
> ><fs>mount /dev/mailxxx-vg/root /
> works fine
> ><fs>mount /dev/vmail-vg/vmail-lv /
> libguestfs: error: mount: mount_stub: /dev/vmail-vg/vmail-lv: expecting a
> device name
>
> How can I get access to this LV?
> Can you help me please?
> Thank you very much.
I would suggest enabling debugging to start with, which should give
you more information about what is actually going on. This section of
the FAQ will explain how to do that:
https://libguestfs.org/guestfs-faq.1.html#debugging-libguestfs
Rich.
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html
Building nbdkit with mingw and wine
by Richard W.M. Jones
The detailed instructions are at the bottom of the README if you
search for "WINDOWS":
https://gitlab.com/nbdkit/nbdkit/-/blob/master/README
These basic commands should work:
$ mingw64-configure --disable-ocaml --disable-perl --disable-vddk
$ mingw64-make
and the binary should be runnable if Wine is installed, eg:
$ ./nbdkit.exe --dump-config
...
host_cpu=x86_64
host_os=mingw32
...
Getting the tests to work under Wine is ... well, a tad more tedious.
There are several issues:
- Windows now supports AF_UNIX, but Wine does not.
- Most tests use Unix domain sockets (ie. nbdkit -U) because using
possibly public TCP ports during tests is not cool and also
difficult to deconflict. So AF_UNIX support is really needed to
run nbdkit tests in any detail.
- I patched Wine to add AF_UNIX support (see attached). However
since then Wine code has changed quite significantly in the way it
creates sockets so my patch doesn't even remotely work and I can't
see an easy way to fix it. Therefore you must apply the attached
patch on top of 7ec069d85f5235db ("ntdll/tests: Fix virtual test
failures on win10pro.") Only 5592 commits behind head!
- Build the patched Wine with: ./configure --enable-win64 && make
- For some reason I had to hand-edit dlls/gdi32/Makefile to remove
the tools/wrc/wrc -pthread parameter. I'm pretty sure I didn't
need to do that before. There could be an upstream commit we are
missing.
If you do all of that and build Wine then you should be able to set
$PATH to point to the patched Wine build directory and run the tests
like so:
$ PATH=~/d/wine:$PATH mingw64-make check -C tests
Tests outside the tests directory fail for miscellaneous reasons.
Most tests will be skipped but for me at least a significant number
pass. I get:
============================================================================
Testsuite summary for nbdkit 1.25.7
============================================================================
# TOTAL: 203
# PASS: 75
# SKIP: 128
# XFAIL: 0
# FAIL: 0
# XPASS: 0
# ERROR: 0
============================================================================
I would recommend adding "WINEDEBUG=warn+all" if nothing seems to be
working.
Rich.
Re: [Libguestfs] Performance issue: "udevadm settle" is called three times for a single call of guestfs_download_offset() (#67)
by Richard W.M. Jones
On Mon, May 17, 2021 at 10:21:33PM -0700, Motohiro Kawahito wrote:
> I'm planning to use guestfs_download_offset() to read a part of
> /dev/sda. I made a micro-benchmark for it.
What is the host file? Raw? qcow2?
> I found that a single 512-byte read takes 60msec. It is very slow.
The whole architecture is not designed for performance in this sort of
case. Essentially it is single-threaded and makes a round trip for
every 512-byte request. See:
https://libguestfs.org/guestfs-internals.1.html#architecture
> I investigated the reason for it. I found that "udevadm settle -E /dev/sda" is
> invoked 3 times and that these three calls take 57msec.
>
> The command udevadm is called from the beginning of is_device_parameter(), as
> shown below.
> https://github.com/libguestfs/libguestfs/blob/047cf7dcd26e649d45e7e21a3b6...
>
> I have the following two questions:
>
> • Can we defer calling udev_settle_file (device) after next two IFs? (That
> is, after Line 128) I tried it, and the performance was improved to 3msec.
Yes, that should be fine, although I don't think it will make your
case faster.
> • Can we cache the result of is_device_parameter() and then use it for the
> other two calls?
Possibly. Do you know where it is being called from 3 times?
Stepping back, I would say that this is the wrong way to go about this
(unless you just want to benchmark libguestfs itself, which will only
tell you that it's slow). If the host file is raw, you can just read
it. If the host file is qcow2, try something like this instead; it
will be far faster:
$ nbdsh \
-c 'h.connect_systemd_socket_activation(["qemu-nbd", "-r", "-f", "qcow2", "test.qcow2"])' \
-c 'print ("%r" % h.pread(512,0))'
You could also write it in C:
https://rwmj.wordpress.com/2019/10/03/how-to-edit-a-qcow2-file-from-c/
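The same read can also be done from a standalone Python script using the libnbd Python bindings, which provide the handle that nbdsh exposes as "h". This is a minimal sketch, assuming the bindings (the nbd module) and qemu-nbd are installed. The helper names open_qcow2 and read_sector are hypothetical.

```python
def read_sector(handle, offset=0, length=512):
    """Read `length` bytes at `offset` from any object with a pread(count, offset) method."""
    return bytes(handle.pread(length, offset))


def open_qcow2(path):
    """Start a read-only qemu-nbd for `path` and return a connected libnbd handle."""
    import nbd  # libnbd Python bindings; imported lazily so read_sector works without them

    h = nbd.NBD()
    h.connect_systemd_socket_activation(
        ["qemu-nbd", "-r", "-f", "qcow2", path]
    )
    return h


# Example use (needs test.qcow2, qemu-nbd and the nbd module):
#   h = open_qcow2("test.qcow2")
#   print("%r" % read_sector(h))
#   h.shutdown()
```

Keeping one handle open and issuing many pread calls on it avoids paying the connection cost for every 512-byte read.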
Rich.