yep, we don't wait for the process inside cpopen code - we expect the caller to take care of it. if you use execCmd with sync=False (as in virt-v2v) - you have to take care for killing the process and wait for its pid after its done - you can see that with the fix  we raise exception and this exception is caught in _import func and call _abort which calls kill and sign the pid to zombiereaper which waits for it to die.
If you use execCmd with sync=True (the default) you can see that we wait until process exists and return.

On Wed, Mar 30, 2016 at 5:30 PM, Nir Soffer <nsoffer@redhat.com> wrote:
On Wed, Mar 30, 2016 at 3:32 PM, Shahar Havivi <shaharh@redhat.com> wrote:
> On 30.03.16 14:28, Nir Soffer wrote:
>> On Wed, Mar 30, 2016 at 1:36 PM, Michal Skrivanek
>> <michal.skrivanek@redhat.com> wrote:
>> >
>> >> On 30 Mar 2016, at 11:49, Richard W.M. Jones <rjones@redhat.com> wrote:
>> >>
>> >> On Wed, Mar 30, 2016 at 12:19:35PM +0300, Shahar Havivi wrote:
>> >>> Hi,
>> >>>
>> >>> We encounter a problem in VDSM project that virt-v2v become zombie task while
>> >>> importing vm from vmware.
>> >>> When virt-v2v is in 'copy disk' mode and we someone deletes the vm at vmware
>> >>> the process hang in read() method,
>> >>> I am pretty sure that its not virt-v2v problem because when I run it from the
>> >>> shell virt-v2v exit with an error, still maybe someone have an idea....
>> >>>
>> >>> I wrote a small python script that encounter the problem:
>> >>>
>> >>> ----------------------------------------------------------------------------
>> >>> from cpopen import CPopen
>> >>>
>> >>> env = {'LIBGUESTFS_BACKEND': 'direct'}
>> >>> cmd = ['/usr/bin/virt-v2v', '-ic',
>> >>>       'vpx://....', '-o',
>> >>>       'local', '-os', '/tmp', '-of', 'raw', '-oa', 'sparse',
>> >>>       '--password-file', '/tmp/passwd', '--machine-readable', 'bbb']
>> >>> p = CPopen(cmd, env=env)
>> >>> while p.returncode is None:
>>
>> p.returncode just return the instance variable, there is no wait() involved.
>>
>> The right way:
>>
>>     while p.poll() is None:
> the problem is proc.stdout.read(1) didn't raise and error when the stream
> closed but return ''.
> its a CPopen behaviour and works differently in subprocess.

No, it works the same in both, this was just a bug in our code.
Fixed in https://gerrit.ovirt.org/55477

>
>>         ...
>>
>> p.returncode calling wait is non-standard feature in vdsm AsyncProc
>> wrapper. This is
>> the object used by v2v vdsm module, so there accessing p.returncode does call
>> p.poll().
>>
>> These non-standard apis will be removed from vdsm, please do not use them.
>>
>> >>>    c = p.stdout.read(1)
>> >>>    print c
>> >>> ----------------------------------------------------------------------------
>> >>
>> >> An actual zombie task?  That would indicate that the parent process
>> >> (your Python program) wasn't doing a wait system call.
>> >>
>> >> I downloaded the cpopen-1.4 program, and it doesn't appear to call any
>> >> of the wait*(2) system calls anywhere, so that could be the problem.
>> >
>> > I suppose the cpopen parameters are not alright…I’m sure vdsm developers can help with that.
>> >
>> >>
>> >> Rich.
>> >>
>> >> --
>> >> Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
>> >> Read my programming and virtualization blog: http://rwmj.wordpress.com
>> >> virt-p2v converts physical machines to virtual machines.  Boot with a
>> >> live CD or over the network (PXE) and turn machines into KVM guests.
>> >> http://libguestfs.org/virt-v2v
>> >
>> > _______________________________________________
>> > Devel mailing list
>> > Devel@ovirt.org
>> > http://lists.ovirt.org/mailman/listinfo/devel
_______________________________________________
Devel mailing list
Devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/devel



--
Yaniv Bronhaim.