Confirming that:

signal (SIGCHLD, SIG_DFL);

works.

Thank you, Rich!
Best Regards,
Peter


On Thu, Nov 8, 2018 at 5:07 PM Richard W.M. Jones <rjones@redhat.com> wrote:
On Thu, Nov 08, 2018 at 04:51:31PM +0200, Peter Dimitrov wrote:
> Here are strace outputs per process.
>
> strace_output.22076 is the plugin's pid. (A little before forking)

Ah ha.

Close reading of the waitpid(2) man page says:

       ECHILD (for waitpid() or waitid()) The process specified by pid
              (waitpid()) or idtype and id (waitid()) does not exist or is
              not a child of the calling process.  (This can happen for
              one's own child if the action for SIGCHLD is set to SIG_IGN.
              See also the Linux Notes section about threads.)
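
The parenthetical in that man page entry can be reproduced in a few
lines.  What follows is only a minimal sketch, assuming Linux
semantics; wait_for_child_errno is a hypothetical helper written for
this illustration, not something from libguestfs or collectd.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, for illustration only.
 * Set SIGCHLD to `disposition', fork a child that exits immediately,
 * and wait for it.  Returns 0 if waitpid() succeeded, otherwise the
 * errno it failed with.  The previous disposition is restored. */
int
wait_for_child_errno (void (*disposition) (int))
{
  void (*old) (int);
  pid_t pid;
  int status, err = 0;

  old = signal (SIGCHLD, disposition);
  if (old == SIG_ERR)
    return errno;

  pid = fork ();
  if (pid == -1)
    err = errno;
  else if (pid == 0)
    _exit (0);                  /* child: nothing to do */
  else {
    /* Parent: retry only if interrupted by a signal. */
    while (waitpid (pid, &status, 0) == -1) {
      if (errno != EINTR) {
        err = errno;
        break;
      }
    }
  }

  signal (SIGCHLD, old);
  return err;
}
```

On Linux, wait_for_child_errno (SIG_IGN) should return ECHILD, because
the kernel discards the child's exit status instead of keeping a
zombie, while wait_for_child_errno (SIG_DFL) should return 0.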

I'm going to guess that collectd is leaking the signal handler setting
into the child process instead of resetting it.  This is a bug in
collectd.

It's surprisingly hard to correctly fork a process in Unix.  Here's
what libvirt does, which is the most comprehensive code that I know
of.  It involves resetting multiple things before running the child:

  https://libvirt.org/git/?p=libvirt.git;a=blob;f=src/util/vircommand.c;h=de937f6f9aa91abb518eac98bfac9dcf37e1f5df;hb=HEAD#l280
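
To give a flavour of what "resetting" means for signals alone, here
is a rough sketch.  It is not a copy of the libvirt code and covers
only signal dispositions and the signal mask; reset_signal_state is a
hypothetical name, and the function is assumed to be called in the
child between fork() and exec().

```c
#define _DEFAULT_SOURCE         /* for NSIG on glibc */
#include <signal.h>
#include <string.h>

/* Hypothetical helper: reset every signal disposition to the default
 * and unblock all signals.  Returns 0 on success, -1 if the signal
 * mask could not be cleared. */
int
reset_signal_state (void)
{
  struct sigaction sa;
  sigset_t none;
  int sig;

  memset (&sa, 0, sizeof sa);
  sa.sa_handler = SIG_DFL;
  sigemptyset (&sa.sa_mask);

  /* SIGKILL, SIGSTOP and some reserved signal numbers cannot be
   * changed; those calls fail and are deliberately ignored. */
  for (sig = 1; sig < NSIG; sig++)
    (void) sigaction (sig, &sa, NULL);

  /* Anything the parent had blocked is unblocked again. */
  sigemptyset (&none);
  return sigprocmask (SIG_SETMASK, &none, NULL);
}
```

A real implementation has more to think about (open file descriptors
and so on), so treat this as a starting point only.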

While you're getting the collectd bug fixed, the easiest workaround is
probably to add:

  signal (SIGCHLD, SIG_DFL);

in your code.
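
Here is a sketch of why that should help, under the assumption (my
guess above, not something I have confirmed) that the daemon ignores
SIGCHLD before it forks the plugin.  plugin_wait_errno is a
hypothetical test function: the parent plays the daemon, the first
child plays the plugin, and the grandchild is whatever the plugin
needs to wait for.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demonstration.  Returns the errno that the "plugin"
 * saw from waitpid() (0 on success), or -1 if the test itself could
 * not be set up.  If reset_sigchld is non-zero the plugin applies
 * the workaround before forking. */
int
plugin_wait_errno (int reset_sigchld)
{
  void (*old) (int);
  int fds[2], err = -1;
  pid_t pid;

  if (pipe (fds) == -1)
    return -1;

  /* Assumed daemon behaviour: ignore SIGCHLD before forking. */
  old = signal (SIGCHLD, SIG_IGN);

  pid = fork ();
  if (pid == 0) {
    /* "Plugin" process: inherits SIG_IGN from its parent. */
    int child_err = 0, status;
    pid_t gpid;

    close (fds[0]);
    if (reset_sigchld)
      signal (SIGCHLD, SIG_DFL);        /* the workaround */

    gpid = fork ();
    if (gpid == 0)
      _exit (0);
    if (gpid == -1)
      child_err = errno;
    else {
      while (waitpid (gpid, &status, 0) == -1) {
        if (errno != EINTR) {
          child_err = errno;
          break;
        }
      }
    }

    /* Report the result to the parent through the pipe. */
    if (write (fds[1], &child_err, sizeof child_err)
        != (ssize_t) sizeof child_err)
      _exit (1);
    _exit (0);
  }

  close (fds[1]);
  if (pid > 0) {
    if (read (fds[0], &err, sizeof err) != (ssize_t) sizeof err)
      err = -1;
    /* With SIGCHLD ignored this call may itself fail with ECHILD,
     * which is harmless here. */
    (void) waitpid (pid, NULL, 0);
  }
  close (fds[0]);

  if (old != SIG_ERR)
    signal (SIGCHLD, old);
  return err;
}
```

On Linux I would expect plugin_wait_errno (0) to return ECHILD and
plugin_wait_errno (1) to return 0.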

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones