On Fri, May 27, 2022 at 04:44:16PM -0500, Eric Blake wrote:
Instead of malloc'ing a C buffer, nbd_pread()ing into it, then
copying
it into an immutable Python bytes object, we can instead pre-create a
correctly-sized Python bytearray object, then nbd_pread() directly
into that object's underlying bytes.
While the data copying might not be the bottleneck compared to the
networking costs, it does have noticeable results; on my machine:
$ export script='
m=1024*1024
size=h.get_size()
for i in range(size // m):
buf = h.pread(m, m*i)
'
$ time ./run nbdkit -U - pattern 10G --run 'nbdsh -u $uri -c
"$script"'
sped up from about 7.8s pre-patch to about 7.1s post-patch,
approximately a 10% speedup.
Note that this slightly changes the python API: h.pread[_structured]
now returns a mutable 'bytearray' object, rather than an immutable
'bytes' object. This makes it possible to modify the just-read string
in place, rather than having to create yet another memory buffer for
any modifications, which offers even more speedups when writing a
read-modify-write paradigm in python. But the change is
backwards-compatible - python already states that a read-write buffer
may be returned instead of readonly for any client that already
operated only on a buffer in a readonly manner.
---
Note that h.pread is more like Python read() semantics in creating a
buffer, while h.aio_pread is more like Python readinto() semantics in
modifying a passed-in buffer. But now that both code paths have a
python object prior to calling into the C API, my next task is to
improve the h.*pread_structured callback function to pass its buffer
as a slice of the Python input buffer, rather than doing yet another
round of pointless memcpy from C into python objects.
generator/Python.ml | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)
I've been previously reminded (commit e0953c for example) that showing
the diff of the generated output is also useful:
--- python/methods.c.bak 2022-05-27 17:08:43.035658709 -0500
+++ python/methods.c 2022-05-27 17:08:50.678669864 -0500
@@ -2295,7 +2295,7 @@
struct nbd_handle *h;
int ret;
PyObject *py_ret = NULL;
- char *buf = NULL;
+ PyObject *buf = NULL;
Py_ssize_t count;
uint64_t offset_u64;
unsigned long long offset; /* really uint64_t */
@@ -2309,19 +2309,20 @@
h = get_handle (py_h);
if (!h) goto out;
flags_u32 = flags;
- buf = malloc (count);
- if (buf == NULL) { PyErr_NoMemory (); goto out; }
+ buf = PyByteArray_FromStringAndSize (NULL, count);
+ if (buf == NULL) goto out;
offset_u64 = offset;
- ret = nbd_pread (h, buf, count, offset_u64, flags_u32);
+ ret = nbd_pread (h, PyByteArray_AS_STRING (buf), count, offset_u64, flags_u32);
if (ret == -1) {
raise_exception ();
goto out;
}
- py_ret = PyBytes_FromStringAndSize (buf, count);
+ py_ret = buf;
+ buf = NULL;
out:
- free (buf);
+ Py_XDECREF (buf);
return py_ret;
}
@@ -2332,7 +2333,7 @@
struct nbd_handle *h;
int ret;
PyObject *py_ret = NULL;
- char *buf = NULL;
+ PyObject *buf = NULL;
Py_ssize_t count;
uint64_t offset_u64;
unsigned long long offset; /* really uint64_t */
@@ -2350,8 +2351,8 @@
h = get_handle (py_h);
if (!h) goto out;
flags_u32 = flags;
- buf = malloc (count);
- if (buf == NULL) { PyErr_NoMemory (); goto out; }
+ buf = PyByteArray_FromStringAndSize (NULL, count);
+ if (buf == NULL) goto out;
offset_u64 = offset;
chunk.user_data = chunk_user_data = alloc_user_data ();
if (chunk_user_data == NULL) goto out;
@@ -2364,16 +2365,17 @@
Py_INCREF (py_chunk_fn);
chunk_user_data->fn = py_chunk_fn;
- ret = nbd_pread_structured (h, buf, count, offset_u64, chunk, flags_u32);
+ ret = nbd_pread_structured (h, PyByteArray_AS_STRING (buf), count, offset_u64, chunk,
flags_u32);
chunk_user_data = NULL;
if (ret == -1) {
raise_exception ();
goto out;
}
- py_ret = PyBytes_FromStringAndSize (buf, count);
+ py_ret = buf;
+ buf = NULL;
out:
- free (buf);
+ Py_XDECREF (buf);
free_user_data (chunk_user_data);
return py_ret;
}
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization:
qemu.org |
libvirt.org