> That makes no sense because we are supposed to have just forked
> successfully

I just realized that libguestfs calls fork() itself. That would explain why my own qemu-img test worked: I launched it with popen().
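
One possible reading of the popen() result, which nothing in this thread confirms: if SIGCHLD is ignored in the calling process, popen() still starts the command and its output can still be read, so the command appears to work, and the failure only shows up in the return value of pclose(). The sketch below is a way to check that. run_with_popen() is a hypothetical helper written for this mail, not something from collectd or libguestfs.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from collectd or libguestfs).
 * Runs `cmd` through popen(), stores the first line of its output in
 * `buf`, and returns whatever pclose() returns: the wait status of the
 * command, or -1 if the child's status could not be collected. */
static int run_with_popen(const char *cmd, char *buf, size_t len) {
  assert(buf != NULL && len > 0);

  FILE *fp = popen(cmd, "r");
  if (fp == NULL)
    return -1;

  /* Reading the output does not depend on being able to wait for the
   * child, so this part works either way. */
  if (fgets(buf, (int)len, fp) == NULL)
    buf[0] = '\0';

  /* pclose() waits for the child; this is where a wait failure shows. */
  return pclose(fp);
}
```

If run_with_popen("qemu-img --version", ...) fills the buffer but returns -1 inside the plugin, the earlier popen() test was hitting the same wait failure and simply not checking for it.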

> So it must be something to do with collectd and how it runs programs.
> Is it using LD_PRELOAD trickery, or replacing libc, or using seccomp?

If I understand the question correctly, it is about how collectd loads its plugins. If so, it uses dlopen():
static int plugin_load_file(const char *file, _Bool global) {
  void (*reg_handle)(void);
  int flags = RTLD_NOW;
  if (global)
    flags |= RTLD_GLOBAL;
  void *dlh = dlopen(file, flags);
  //...
  reg_handle = (void (*)(void))dlsym(dlh, "module_register");
  //...
  (*reg_handle)();
}

Does this give any clues?

Best Regards,
Peter


On Wed, Nov 7, 2018 at 12:56 PM Richard W.M. Jones <rjones@redhat.com> wrote:
On Wed, Nov 07, 2018 at 12:32:48PM +0200, Peter Dimitrov wrote:
> Thank you, Rich,
> This was the issue indeed. export LIBGUESTFS_BACKEND=direct fixed it.
>
> The next step I tried was to integrate libguestfs in collectd virt plugin
> to collect this data automatically.
> In this case I'm having an unknown error in add_libvirt_dom() (same with
> add_domain) when it's invoking qemu-img to create overlay image.
>
> There is no difference between manual and service execution.
> I tried setting LIBGUESTFS_BACKEND to direct,
> libvirt, libvirt:qemu:///session with no success.
> Also tried using a different tmp dir just in case - nothing.
>
> Maybe something is wrong with how collectd runs its plugins (dynamic
> linking)?
> Invoking virt-df from collectd's plugin gives the same error message.
> I tried running the same qemu-img command from collectd and it passes
> though! Confusing...

The log indicates something a bit strange is going on:

> libguestfs: command: run: qemu-img
> libguestfs: command: run: \ create
> libguestfs: command: run: \ -f qcow2
> libguestfs: command: run: \ -o
> backing_file=/home/peterd/TVE/wer.qcow2,backing_fmt=qcow2
> libguestfs: command: run: \ /tmp/libguestfsUIZbDK/overlay1.qcow2
> Formatting '/tmp/libguestfsUIZbDK/overlay1.qcow2', fmt=qcow2
> size=107374182400 backing_file=/home/peterd/TVE/wer.qcow2 backing_fmt=qcow2
> encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16
> libguestfs: error: command: waitpid: No child processes
> libguestfs: error: qemu-img: /tmp/libguestfsUIZbDK/overlay1.qcow2: qemu-img
> exited for an unknown reason (status -1), see debug messages above

Obviously waitpid(2) is failing with ECHILD here:

https://github.com/libguestfs/libguestfs/blob/3430c2dd654b19a55d213a9302ac5e4b6a387bee/lib/command.c#L741

That makes no sense because we are supposed to have just forked
successfully:

https://github.com/libguestfs/libguestfs/blob/3430c2dd654b19a55d213a9302ac5e4b6a387bee/lib/command.c#L479

called from:

https://github.com/libguestfs/libguestfs/blob/3430c2dd654b19a55d213a9302ac5e4b6a387bee/lib/command.c#L764

Notice also that qemu-img *does* run (you can see the output from the
command).

So it must be something to do with collectd and how it runs programs.
Is it using LD_PRELOAD trickery, or replacing libc, or using seccomp?
My guess is that any program which launched a subprocess and then
waited for it would fail in the same way.
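
The guess above can be tested directly from inside the plugin with a few lines of C. One documented way for waitpid(2) to fail with ECHILD right after a successful fork is a SIGCHLD disposition of SIG_IGN (or the SA_NOCLDWAIT flag): children are then reaped automatically and there is no status left to collect. Whether collectd actually sets this is an assumption to verify, not something established in this thread. Both helpers below are hypothetical names, not collectd or libguestfs functions.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper. Returns 1 if terminated children are reaped
 * automatically in this process (SIGCHLD ignored or SA_NOCLDWAIT set),
 * 0 if not, and -1 if the disposition could not be read. */
static int sigchld_autoreaps(void) {
  struct sigaction sa;
  if (sigaction(SIGCHLD, NULL, &sa) == -1)
    return -1;
  if (sa.sa_flags & SA_NOCLDWAIT)
    return 1;
  /* sa_handler is only meaningful when SA_SIGINFO is not set. */
  if (!(sa.sa_flags & SA_SIGINFO) && sa.sa_handler == SIG_IGN)
    return 1;
  return 0;
}

/* Hypothetical helper. Forks a child that exits at once with code 0,
 * then waits for it. Returns 0 if the child was reaped (its wait status
 * is stored in *status), otherwise the errno from fork() or waitpid(). */
static int fork_and_wait(int *status) {
  assert(status != NULL);

  pid_t pid = fork();
  if (pid == -1)
    return errno;
  if (pid == 0)
    _exit(0); /* child: nothing to do */

  pid_t r;
  do {
    r = waitpid(pid, status, 0);
  } while (r == -1 && errno == EINTR);

  return r == pid ? 0 : errno;
}
```

Calling both from the plugin's init function and logging the results should be enough: if sigchld_autoreaps() returns 1 and fork_and_wait() returns ECHILD, the problem is the signal disposition inherited from the host process rather than anything in libguestfs or qemu-img.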

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/