On Tue, May 31, 2022 at 6:52 PM Eric Blake <eblake(a)redhat.com> wrote:
On Tue, May 31, 2022 at 10:49:03AM -0500, Eric Blake wrote:
> This patch fixes the corner-case regression introduced by the previous
> patch where the pread_structured callback buffer lifetime ends as soon
> as the callback (that is, where accessing a stashed callback parameter
> can result in ValueError instead of modifying a harmless copy). With
> careful effort, we can create a memoryview of the Python object that
> we read into, then slice that memoryview as part of the callback; now
> the slice will be valid as long as the original object is also valid.
>
> | @@ -76,8 +77,24 @@ chunk_wrapper (void *user_data, const vo
> | PyObject *py_subbuf = NULL;
> | PyObject *py_error = NULL;
> |
> | - /* Cast away const */
> | - py_subbuf = PyMemoryView_FromMemory ((char *) subbuf, count, PyBUF_READ);
> | + if (data->buf) {
In which case we don't have data->buf?
> | + PyObject *start, *end, *slice;
> | + assert (PyMemoryView_Check (data->buf));
Why do we need data->buf to be a memoryview?
We can create a new memoryview form data->buf - it only needs to be an object
supporting the buffer protocol. Basically we need the C version of:
memoryview(buf)[start:stop]
buf can be bytes, bytearray, mmap, or another memoryview.
> | + ptrdiff_t offs = subbuf - PyMemoryView_GET_BUFFER
(data->buf)->buf;
Because PyMemoryView_Check is inside the assert, build with NDEBUG will
remove the check, and this call may crash if data->buf is not a memoryview.
It would be nicer if we could get the offset without looking into the internal
buffer of the memoryview.
> | + start = PyLong_FromLong (offs);
> | + if (!start) { PyErr_PrintEx (0); goto out; }
> | + end = PyLong_FromLong (offs + count);
> | + if (!end) { Py_DECREF (start); PyErr_PrintEx (0); goto out; }
> | + slice = PySlice_New (start, end, NULL);
> | + Py_DECREF (start);
> | + Py_DECREF (end);
> | + if (!slice) { PyErr_PrintEx (0); goto out; }
> | + py_subbuf = PyObject_GetItem (data->buf, slice);
Missing a Py_DECREF (slice) here. Easy enough to add...
> +++ b/generator/Python.ml
> @@ -187,8 +187,24 @@ let
> pr " PyList_SET_ITEM (py_%s, i_%s, py_e_%s);\n" n n n;
> pr " }\n"
> | CBBytesIn (n, len) ->
> - pr " /* Cast away const */\n";
> - pr " py_%s = PyMemoryView_FromMemory ((char *) %s, %s,
PyBUF_READ);\n" n n len;
> + pr " if (data->buf) {\n";
> + pr " PyObject *start, *end, *slice;\n";
> + pr " assert (PyMemoryView_Check (data->buf));\n";
> + pr " ptrdiff_t offs = %s - PyMemoryView_GET_BUFFER
(data->buf)->buf;\n" n;
> + pr " start = PyLong_FromLong (offs);\n";
> + pr " if (!start) { PyErr_PrintEx (0); goto out; }\n";
> + pr " end = PyLong_FromLong (offs + %s);\n" len;
> + pr " if (!end) { Py_DECREF (start); PyErr_PrintEx (0); goto out;
}\n";
> + pr " slice = PySlice_New (start, end, NULL);\n";
> + pr " Py_DECREF (start);\n";
> + pr " Py_DECREF (end);\n";
> + pr " if (!slice) { PyErr_PrintEx (0); goto out; }\n";
> + pr " py_%s = PyObject_GetItem (data->buf, slice);\n" n;
...here
And I really wish python didn't make it so hard to grab a slice of
another object using C code. Having to create 3 temporary PyObjects
instead of having a utility C function that takes normal integers was
annoying.
Does this work?
PySlice_New(NULL, NULL, NULL);
PySlice_AdjustIndices(length, start, stop, step);
Nir