Re: [Libguestfs] [libnbd PATCH] python: Speed up pread

Friday, 27 May 2022

On Fri, May 27, 2022 at 04:44:16PM -0500, Eric Blake wrote:
...
 Instead of malloc'ing a C buffer, nbd_pread()ing into it, then
copying
 it into an immutable Python bytes object, we can instead pre-create a
 correctly-sized Python bytearray object, then nbd_pread() directly
 into that object's underlying bytes.
 
 While the data copying might not be the bottleneck compared to the
 networking costs, it does have noticeable results; on my machine:
 
 $ export script='
 m=1024*1024
 size=h.get_size()
 for i in range(size // m):
   buf = h.pread(m, m*i)
 '
 $ time ./run nbdkit -U - pattern 10G --run 'nbdsh -u $uri -c
"$script"'
 
 sped up from about 7.8s pre-patch to about 7.1s post-patch,
 approximately a 10% speedup.
 
 Note that this slightly changes the python API: h.pread[_structured]
 now returns a mutable 'bytearray' object, rather than an immutable
 'bytes' object.  This makes it possible to modify the just-read string
 in place, rather than having to create yet another memory buffer for
 any modifications, which offers even more speedups when writing a
 read-modify-write paradigm in python.  But the change is
 backwards-compatible - python already states that a read-write buffer
 may be returned instead of readonly for any client that already
 operated only on a buffer in a readonly manner.
 ---
 
 Note that h.pread is more like Python read() semantics in creating a
 buffer, while h.aio_pread is more like Python readinto() semantics in
 modifying a passed-in buffer.  But now that both code paths have a
 python object prior to calling into the C API, my next task is to
 improve the h.*pread_structured callback function to pass its buffer
 as a slice of the Python input buffer, rather than doing yet another
 round of pointless memcpy from C into python objects.
 
  generator/Python.ml | 17 +++++++++--------
  1 file changed, 9 insertions(+), 8 deletions(-) 
I've been previously reminded (commit e0953c for example) that showing
the diff of the generated output is also useful:

--- python/methods.c.bak	2022-05-27 17:08:43.035658709 -0500
+++ python/methods.c	2022-05-27 17:08:50.678669864 -0500
@@ -2295,7 +2295,7 @@
   struct nbd_handle *h;
   int ret;
   PyObject *py_ret = NULL;
-  char *buf = NULL;
+  PyObject *buf = NULL;
   Py_ssize_t count;
   uint64_t offset_u64;
   unsigned long long offset; /* really uint64_t */
@@ -2309,19 +2309,20 @@
   h = get_handle (py_h);
   if (!h) goto out;
   flags_u32 = flags;
-  buf = malloc (count);
-  if (buf == NULL) { PyErr_NoMemory (); goto out; }
+  buf = PyByteArray_FromStringAndSize (NULL, count);
+  if (buf == NULL) goto out;
   offset_u64 = offset;
 
-  ret = nbd_pread (h, buf, count, offset_u64, flags_u32);
+  ret = nbd_pread (h, PyByteArray_AS_STRING (buf), count, offset_u64, flags_u32);
   if (ret == -1) {
     raise_exception ();
     goto out;
   }
-  py_ret = PyBytes_FromStringAndSize (buf, count);
+  py_ret = buf;
+  buf = NULL;
 
  out:
-  free (buf);
+  Py_XDECREF (buf);
   return py_ret;
 }
 
@@ -2332,7 +2333,7 @@
   struct nbd_handle *h;
   int ret;
   PyObject *py_ret = NULL;
-  char *buf = NULL;
+  PyObject *buf = NULL;
   Py_ssize_t count;
   uint64_t offset_u64;
   unsigned long long offset; /* really uint64_t */
@@ -2350,8 +2351,8 @@
   h = get_handle (py_h);
   if (!h) goto out;
   flags_u32 = flags;
-  buf = malloc (count);
-  if (buf == NULL) { PyErr_NoMemory (); goto out; }
+  buf = PyByteArray_FromStringAndSize (NULL, count);
+  if (buf == NULL) goto out;
   offset_u64 = offset;
   chunk.user_data = chunk_user_data = alloc_user_data ();
   if (chunk_user_data == NULL) goto out;
@@ -2364,16 +2365,17 @@
   Py_INCREF (py_chunk_fn);
   chunk_user_data->fn = py_chunk_fn;
 
-  ret = nbd_pread_structured (h, buf, count, offset_u64, chunk, flags_u32);
+  ret = nbd_pread_structured (h, PyByteArray_AS_STRING (buf), count, offset_u64, chunk,
flags_u32);
   chunk_user_data = NULL;
   if (ret == -1) {
     raise_exception ();
     goto out;
   }
-  py_ret = PyBytes_FromStringAndSize (buf, count);
+  py_ret = buf;
+  buf = NULL;
 
  out:
-  free (buf);
+  Py_XDECREF (buf);
   free_user_data (chunk_user_data);
   return py_ret;
 }



-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

    

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

Re: [Libguestfs] [libnbd PATCH] python: Speed up pread