Confirming that:
signal (SIGCHLD, SIG_DFL);
works.
Thank you, Rich!
Best Regards,
Peter
On Thu, Nov 8, 2018 at 5:07 PM Richard W.M. Jones <rjones(a)redhat.com> wrote:
On Thu, Nov 08, 2018 at 04:51:31PM +0200, Peter Dimitrov wrote:
> Here are strace outputs per process.
>
> strace_output.22076 is the plugin's pid. (A little before forking)
Ah ha.
Close reading of the waitpid(2) man page says:
    ECHILD (for waitpid() or waitid()) The process specified by pid
           (waitpid()) or idtype and id (waitid()) does not exist or is
           not a child of the calling process.  (This can happen for
           one's own child if the action for SIGCHLD is set to SIG_IGN.
           See also the Linux Notes section about threads.)
I'm going to guess that collectd is leaking the signal handler setting
into the child process instead of resetting it. This is a bug in
collectd.
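The man page behaviour quoted above can be reproduced in isolation. The following is a minimal sketch, assuming Linux; fork_and_wait is a hypothetical helper written for illustration and is not taken from collectd or libguestfs. It sets the SIGCHLD disposition, forks a child that exits at once, and reports what waitpid() does.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: set SIGCHLD to `disposition`, fork a child that
 * exits immediately, then wait for it.
 * Returns 0 if waitpid() reaped the child, the errno value if
 * waitpid() failed, or -1 if fork() itself failed. */
static int
fork_and_wait (void (*disposition) (int))
{
  pid_t pid;
  int status;

  signal (SIGCHLD, disposition);

  pid = fork ();
  if (pid == -1)
    return -1;
  if (pid == 0)
    _exit (0);                  /* child: nothing to do */

  for (;;) {
    if (waitpid (pid, &status, 0) == pid)
      return 0;                 /* child was reaped normally */
    if (errno != EINTR)
      return errno;             /* e.g. ECHILD when SIGCHLD is ignored */
  }
}
```

On Linux, fork_and_wait (SIG_IGN) is expected to return ECHILD, because the kernel discards the child's exit status instead of keeping it for the parent, while fork_and_wait (SIG_DFL) is expected to return 0.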
It's surprisingly hard to correctly fork a process in Unix. Here's
what libvirt does, which is the most comprehensive code that I know
of. It involves resetting multiple things before running the child:
https://libvirt.org/git/?p=libvirt.git;a=blob;f=src/util/vircommand.c;h=d...
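As a rough illustration of one part of that reset, here is a sketch that restores default signal dispositions and clears the signal mask. It is not libvirt's code; reset_signals_for_child is a hypothetical helper, and a complete implementation would also need to deal with other inherited state such as open file descriptors. It assumes glibc, where NSIG is available from <signal.h>.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE         /* assumption: needed for NSIG on glibc */
#endif
#include <signal.h>
#include <string.h>

/* Hypothetical helper, intended to be called in the child after fork()
 * and before exec().  Returns 0 on success, -1 on error. */
static int
reset_signals_for_child (void)
{
  struct sigaction sa;
  sigset_t empty;
  int sig;

  memset (&sa, 0, sizeof sa);
  sa.sa_handler = SIG_DFL;
  sigemptyset (&sa.sa_mask);

  /* Failures are expected for SIGKILL, SIGSTOP and signal numbers
   * reserved by the C library, so the return value is ignored. */
  for (sig = 1; sig < NSIG; sig++)
    (void) sigaction (sig, &sa, NULL);

  /* Unblock every signal the parent may have left blocked. */
  sigemptyset (&empty);
  return sigprocmask (SIG_SETMASK, &empty, NULL);
}
```

Ignored dispositions and the signal mask both survive exec, which is why they have to be reset explicitly in the child.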
While you're getting the collectd bug fixed, the easiest workaround is
probably to add:
signal (SIGCHLD, SIG_DFL);
in your code.
Rich.
--
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones