I ruled out the software bridge (iperf3 bandwidth and latency were almost the same on the host and in the guest) and
disk writes (qemu-img convert from a source file to a destination file performed comparably on the host and in the guest VM) as the cause.
 
Next I profiled both sshd and qemu-img during the conversion phase using the BPF tools (profile and tcptop), and this is what I see:

Time (min)  File size in KB (growth)  sshd RX in KB (tcptop)  qemu-img RX in KB (tcptop)  Profile (flamegraph)
----------  ------------------------  ----------------------  --------------------------  --------------------
0 - 5       5478400                   5612971                 5601536                     qemu-img-0-5.svg
5 - 10      8365440 (+2887040)        5669043                 5656517                     qemu-img-5-10.svg
10 - 15     8423296 (+57856)          5722376                 5708926                     qemu-img-10-15.svg
15 - 20     9472128 (+1048832)        5708230                 5693710                     qemu-img-15-20.svg
20 - 25     9540992 (+68864)          5723629                 5712002                     qemu-img-20-25.svg
25 - 30     9617024 (+76032)          5721568                 5708794                     qemu-img-25-30.svg
30 - 35     11263040 (+1646016)       5689537                 5678589                     qemu-img-30-35.svg
35 - 40     11263040 (+0)             5722598                 5709954                     qemu-img-35-40.svg
40 - 45     13252160 (+1989120)       5693774                 5682506                     qemu-img-40-45.svg
45 - 50     15272576 (+2020416)       5698998                 5686398                     qemu-img-45-50.svg
50 - 55     15272704 (+128)           985500                  983322                      qemu-img-50-55.svg
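
To make the comparison between data received and data written easier to read, here is a small Python sketch that recomputes the per-interval file growth from the cumulative File Size column and sets it against the qemu-img RX column. It assumes the tcptop figures are per 5-minute interval in units of 1024 bytes, and that the output file started at size zero; the last interval was probably shorter than 5 minutes, so its rate is not meaningful.

```python
# Cumulative size of the "-sda" output file (KB) at the end of each 5-minute
# interval, copied from the File Size column of the table.
file_size_kb = [5478400, 8365440, 8423296, 9472128, 9540992, 9617024,
                11263040, 11263040, 13252160, 15272576, 15272704]

# KB received by qemu-img in each interval (tcptop), copied from the table.
qemu_img_rx_kb = [5601536, 5656517, 5708926, 5693710, 5712002, 5708794,
                  5678589, 5709954, 5682506, 5686398, 983322]

INTERVAL_SECONDS = 5 * 60  # assumption: every row covers a full 5 minutes


def interval_growth(sizes):
    """Turn cumulative sizes into per-interval growth (first entry starts from 0)."""
    return [sizes[0]] + [later - earlier for earlier, later in zip(sizes, sizes[1:])]


growth_kb = interval_growth(file_size_kb)

for i, (grown, rx) in enumerate(zip(growth_kb, qemu_img_rx_kb)):
    start = i * 5
    rx_mib_per_s = rx / 1024 / INTERVAL_SECONDS
    ratio = grown / rx  # fraction of received data that showed up as file growth
    print(f"{start:2d}-{start + 5:2d} min: file +{grown:>8d} KB, "
          f"RX {rx:>8d} KB ({rx_mib_per_s:5.1f} MiB/s), written/received = {ratio:.2f}")
```

The receive rate stays at roughly 18 MiB/s for the first 50 minutes, while the written/received ratio drops well below 1 after the first interval.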


The bulk of the writes to the "-sda" file completed within 15-20 minutes, even though sshd and qemu-img kept receiving a constant stream of data from the nbd-server.
I'm not sure if this is because of sparseness (I don't know whether sparsification is handled on the nbd-server side or on the receiving side).
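
One way to look at the sparseness question on the receiving side would be to compare the apparent size of the output file with the space actually allocated for it. This is only a minimal Python sketch, assuming a Linux filesystem where st_blocks is reported in 512-byte units; the function name is mine and is not part of any virt-p2v tooling.

```python
import os


def sparseness(path):
    """Return (apparent_bytes, allocated_bytes) for the file at path.

    Assumption: st_blocks counts 512-byte units, as on Linux.
    A file whose allocated size is well below its apparent size is sparse.
    """
    st = os.stat(path)
    apparent = st.st_size
    allocated = st.st_blocks * 512
    return apparent, allocated
```

Calling sparseness() on the "-sda" file at a few points during the copy would show whether the allocated size lags behind the apparent size, which would suggest that zero regions are being skipped on the writing side.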

The flamegraphs also show very little disk write activity after 20 minutes (mostly poll and recv_msg).

I have yet to run a similar experiment on the host to compare the behavior.

Does this provide any clues?

thanks
Suresh

On Wed, Apr 10, 2019 at 10:30 AM Richard W.M. Jones <rjones@redhat.com> wrote:
On Wed, Apr 10, 2019 at 10:15:43AM -0700, Sureshkumar Kaliannan wrote:
> thanks Richard,
>
> The experiment was indeed done with nested VM enabled. I am not sure about
> the internals, but i thought once overlay is setup the 2 main processes are
> sshd and qemu-img convert (reading data from sshd and doing the conversion)

Yes this should be true.  I wouldn't expect copying to be slower.

So initial guess probably wrong.  I wonder if there are some extra
steps such as a slow software bridge or openvswitch on the host?

> I don't see any of the qemu process running.

As per http://libguestfs.org/virt-p2v.1.html#how-virt-p2v-works you
should see qemu-nbd running on the physical server, and qemu-img
running on the conversion server during the copying phase.

> Initial overlay setup was pretty quick and rest of the time was spent in
> qemu-img convert operation

Indeed qemu should exit before copying starts.

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/