On 02/01/22 09:35, Richard W.M. Jones wrote:
On Tue, Feb 01, 2022 at 08:22:47AM +0100, Laszlo Ersek wrote:
> On 01/31/22 17:14, Richard W.M. Jones wrote:
>>
>> Thanks - I pushed this as:
>>
>> 521e69f7..24217480 master -> master
>>
>> I changed patch 3 to include your recommendations. But also I noticed
>> there was a mistake in the memcpy which I hope I fixed and it's worth
>> reading the comment about this:
>>
>>
https://gitlab.com/nbdkit/nbdkit/-/blob/2421748073b71ab600db50f748a53b8f3...
>
>
> /* How much to copy here?
> *
> * vector_reserve always makes the buffer larger so we don't have to
> * deal with the case of a shrinking buffer.
> *
> * Like the underlying realloc we do not have to zero the newly
> * reserved elements.
> *
> * However (like realloc) we have to copy the existing elements even
> * those that are not allocated and only reserved (between 'len' and
> * 'cap').
>
> That's strange; I'd say any " offset >= len" access to a vector
is an out-of-bounds bug in the client code. From "common/utils/vector.h":
>
> * where ‘names.ptr[]’ will be an array of strings and ‘names.len’
> * will be the number of strings. There are no get/set accessors. To
> * iterate over the strings you can use the ‘.ptr’ field directly:
> *
> * for (size_t i = 0; i < names.len; ++i)
> * printf ("%s\n", names.ptr[i]);
>
> Then,
>
> *
> * The spec of vector is not clear on the last two points, but
> * existing code depends on this undocumented behaviour.
> */
>
> To me the spec ("common/utils/vector.h") seemed clear:
>
> /* Reserve n elements at the end of the vector. Note space is \
> * allocated and capacity is increased, but the vector length \
> * is not increased and the new elements are not initialized. \
> */ \
>
> If length is not increased, then elements between "len" and "old
> cap" remain undefined as before, and elements between "old cap" and
> "new cap" are new, and also not initialized.
That's what the spec says, but unfortunately not what the users of it
are doing. For example, common/allocators/malloc.c uses
vector_reserve to reserve memory and then memsets it and uses it.
That's because there isn't an "allocate N x zero elements" operation,
which there probably should be. plugins/data/format.c is similar
although it does update the .len field after.
TBH it's arguable both ways. Vector is only meant as a nicer wrapper
around realloc, so you can argue that what realloc does is the spec.
It's also intended that the implementation is completely exposed.
This is only an internal interface after all.
> I'm slightly concerned about accessing uninitialized (only
> allocated) storage with memcpy(); maybe it can be shown to be
> undefined behavior, maybe not, but I compilers have been exploiting
> UB more aggressively over time.
In both cases above the callers memset/initialize it before use.