Less copying is always better. But I was quite surprised by some of
my test cases while trying to prove that I had speedups; there's a
huge difference between:
  for i in range(size // m):
      buf = h.pread_structured(m, m*i, f)
and
  for i in range(size // m):
      buf = h.pread_structured(m, m*i, f)
      buf = None
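
(Neither fragment is self-contained; for anyone who wants to
reproduce the comparison, here is a sketch of the sort of harness I
mean.  The nbdkit invocation, the block size m, and the no-op chunk
callback f are placeholders, not the exact test I ran:)

  import time
  import nbd

  h = nbd.NBD()
  # Any NBD server works; nbdkit's memory plugin is convenient.
  h.connect_command(["nbdkit", "-s", "--exit-with-parent",
                     "memory", "size=1G"])
  size = h.get_size()
  m = 512 * 1024  # arbitrary block size for this sketch

  def f(buf, offset, status, err):
      # No-op chunk callback; consult your libnbd version for the
      # exact signature it expects.
      return 0

  start = time.time()
  for i in range(size // m):
      buf = h.pread_structured(m, m*i, f)
      # The slow variant adds "buf = None" here.
  print("%.2fs elapsed" % (time.time() - start))
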
The former loop (plain rebinding of buf) takes around 4.5s with
libnbd 1.12.3, the latter (with the extra buf = None) around 15s,
even though I was expecting the latter to be faster.  On further
thought, I think what is happening is that Python's memory management
can spot when an allocation is trivially reusable: when buf is
rebound, the new bytearray has the same size as the old one that is
simultaneously dropping to zero references, so the same block of
memory can be recycled; the intermediate buf = None instead frees the
old buffer in isolation, throwing that correlation away and forcing
each iteration to allocate fresh memory.  But with either python
script, this series (in particular the final patch) makes a
noticeable difference by exposing the subbuf passed to the
[aio_]pread_structured callback as a slice of the original python
object, rather than yet another memory copy.
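
The win from slicing is easy to demonstrate in pure Python.  The toy
loop below is only an analogy for what the binding now does at the C
level, not the binding code itself: slicing a bytearray copies the
data, while slicing a memoryview hands back a zero-copy window:

  import time

  payload = bytearray(64 * 1024 * 1024)
  view = memoryview(payload)
  chunk = 1024

  start = time.time()
  for off in range(0, len(payload), chunk):
      sub = payload[off:off + chunk]  # allocates a fresh copy each time
  copy_time = time.time() - start

  start = time.time()
  for off in range(0, len(payload), chunk):
      sub = view[off:off + chunk]     # zero-copy view onto payload
  slice_time = time.time() - start

  print(copy_time, slice_time)
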
Eric Blake (5):
  python: Simplify passing of mutable *error to callbacks
  python: Alter lock for persistent buffer
  python: Accept all buffer-like objects in aio_p{read,write}
  python: Support len(nbd.Buffer(n))
  python: Slice structured read callback buffer from original
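
As a taste of patches 3 and 4, client code along these lines should
now work (an untested sketch; the nbdkit command line is merely a
placeholder):

  import nbd

  h = nbd.NBD()
  h.connect_command(["nbdkit", "-s", "--exit-with-parent",
                     "memory", "size=1M"])

  # Patch 3: aio_pwrite accepts any buffer-like object (bytes,
  # bytearray, memoryview), not only nbd.Buffer.
  cookie = h.aio_pwrite(bytearray(512), 0)
  while not h.aio_command_completed(cookie):
      h.poll(-1)

  # Patch 4: nbd.Buffer now supports len().
  buf = nbd.Buffer(512)
  cookie = h.aio_pread(buf, 0)
  while not h.aio_command_completed(cookie):
      h.poll(-1)
  print(len(buf))  # 512
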
 generator/Python.ml                |  54 ++++++++---------
 python/handle.c                    |  19 ++++--
 python/t/405-pread-structured.py   |  95 +++++++++---------------------
 python/t/500-aio-pread.py          |  21 +++++++
 python/t/505-aio-pread-callback.py |  93 ++++++++---------------------
 python/t/510-aio-pwrite.py         |  17 ++++++
 python/t/580-aio-is-zero.py        |   2 +
 python/utils.c                     |  53 +++++++++++++++++
 8 files changed, 186 insertions(+), 168 deletions(-)
--
2.36.1