Even when I run this with lots of extra debugging, it seems to be doing
what is expected.  If there is a bug, I can't see it.
Note that I'm only testing qemu-img convert, as that is what's relevant
to virt-v2v.  Previously I was using a hand-written libnbd test that
did random reads across the disk, and that test showed a modest
improvement with 8 handles.
Here are the results across a range of pool sizes, all with -W -m 16:
1 handle:
real	1m8.031s
user	0m0.112s
sys	0m1.560s
2 handles:
real	1m6.465s
user	0m0.106s
sys	0m1.607s
4 handles:
real	1m21.488s
user	0m0.126s
sys	0m1.620s
8 handles:
real	1m27.790s
user	0m0.099s
sys	0m1.625s
16 handles:
real	1m33.974s
user	0m0.124s
sys	0m1.718s
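To put the wall-clock numbers above on a common scale, here is a rough sketch that converts each "real" time to seconds and prints the change relative to the 1-handle run.  The timings are copied from the results above; the helper names (to_seconds, relative_change) are only illustrative and not part of any tool mentioned here.

```python
# Wall-clock ("real") times copied from the results above, keyed by the
# number of handles in the pool.  All runs used -W -m 16.
REAL_TIMES = {
    1: "1m8.031s",
    2: "1m6.465s",
    4: "1m21.488s",
    8: "1m27.790s",
    16: "1m33.974s",
}


def to_seconds(stamp):
    """Convert a shell `time` value such as '1m8.031s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


def relative_change(times, baseline=1):
    """Return the percentage change of each run against the baseline run."""
    base = to_seconds(times[baseline])
    return {n: (to_seconds(t) - base) / base * 100 for n, t in times.items()}


if __name__ == "__main__":
    for handles, pct in relative_change(REAL_TIMES).items():
        secs = to_seconds(REAL_TIMES[handles])
        print(f"{handles:2d} handles: {secs:7.3f}s ({pct:+.1f}%)")
```

On these figures, 2 handles comes out about 2% faster than 1 handle, while 16 handles is about 38% slower.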
I also tried matching the number of coroutines to the number
of handles in the pool (i.e. pool-max == -m):
1 coroutine, 1 handle:
real	1m9.545s
user	0m0.083s
sys	0m1.592s
2 coroutines, 2 handles:
real	1m6.130s
user	0m0.078s
sys	0m1.567s
4 coroutines, 4 handles:
real	1m22.109s
user	0m0.107s
sys	0m1.690s
8 coroutines, 8 handles:
real	1m26.490s
user	0m0.108s
sys	0m1.650s
16 coroutines, 16 handles (same as last result above):
real	1m33.974s
user	0m0.124s
sys	0m1.718s
Rich.
-- 
Richard Jones, Virtualization Group, Red Hat 
http://people.redhat.com/~rjones
Read my programming and virtualization blog: 
http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW