Re: [Libguestfs] [libnbd PATCH v3 05/29] vector: (mostly) factor out DEFINE_VECTOR_EMPTY

Tuesday, 21 February 2023

On 2/20/23 21:38, Eric Blake wrote:
...
 On Mon, Feb 20, 2023 at 06:03:13PM +0100, Laszlo Ersek wrote:
> On 2/15/23 21:27, Eric Blake wrote:
>> On Wed, Feb 15, 2023 at 03:11:34PM +0100, Laszlo Ersek wrote:
>>> The "name##_iter" function is used 11 times in libnbd; in all those
cases,
>>> "name" is "string_vector", and the "f" callback
is "free":
>>>
>>>   string_vector_iter (..., (void *) free);
>>>
>>> Casting "free" to (void*) is ugly. (Well-defined by POSIX, but
still.)
>>
>> Tangentially related: casting function pointers in general may get
>> harder as more compilers move towards C23 and its newer rules (see for
>> example
>> https://lists.gnu.org/archive/html/bug-gnulib/2023-02/msg00055.html or
>> this gcc 13 bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108694
>> which highlights some of the warnings that newer compilers will start
>> warning us about).  While this patch focuses on avoiding casts between
>> fn(*)(type*) and void*, I don't know off-hand if C23 will permit
>> fn(*)(void*) as a compatible function pointer with fn(*)(type).
>
> My understanding is that, per C99 at least, ret_type(*)(void*) is
> compatible with ret_type(*)(type*) precisely if void* is compatible with
> type* (6.7.5.3p15).
>
> Whether void* is compatible with type* depends on... ugh, that's hard to
> deduce from the standard. 6.7.5.1p2 says, "For two pointer types to be
> compatible, both shall be identically qualified and both shall
> be pointers to compatible types". I don't think "void" (as a type
in
> itself) is compatible with any type!
>
> Now, there is one particular statement on void* -- 6.2.5p27 says, "A
> pointer to void shall have the same representation and alignment
> requirements as a pointer to a character type."
>
> (I think the statements about *converting* void* to type* and vice versa
> do not apply here; AFAICT "compatibility" is about reinterpreting the
> bit patterns, not converting values.)
>
> In Annex I (Common warnings, "informative"), the following is listed:
> "An implicit narrowing conversion is encountered, such as the assignment
> of [...] a pointer to void to a pointer to any type other than a
> character type".
>
> All in all I don't think ret_type(*)(type*) is compatible with
> ret_type(*)(void*) in the general case, and that's why in this patch I
> didn't want to go more general than I absolutely needed to.

 Thanks for at least trying to find something definitive in the
 standard.  Now you know why I skipped researching this particular
 issue - it's not straightforward to figure out when function pointers
 with differing parameter types are compatible. 
Let's be honest: it's staggeringly difficult to collect whatever
"compatible type" means, in the standard.

...

>>
>> Thinking higher-level now, your new macro is something where we have
>> to do a two-step declaration of macro types where we want the new
>> function.  Would it make more sense to change the signature of the
>> DEFINE_VECTOR_TYPE() macro to take a third argument containing the
>> function name to call on cleanup paths, with the ability to easily
>> write/reuse a no-op function for vectors that don't need to call
>> free(), where we can then unconditionally declare name##_empty() that
>> will work with all vector types?  That is, should we consider instead
>> doing something like:
>>
>> DEFINE_VECTOR_TYPE (string_vector, char *, free);
>>
>> DEFINE_VECTOR_TYPE (int_vector, int, noop);
>
> My counter-arguments:
>
> - this requires updates to all existent DEFINE_VECTOR_TYPE macro
> invocations,
>
> - with "noop" passed to _reset, _reset and _empty become effectively the
> same, so we get (at least partially) duplicate APIs,
>
> - this would be a step towards combinatorial explosion
>
> - if "noop" does nothing, then why call it on each element of the vector
> anyway? It's not only the function call that becomes superfluous in the
> loop bodym with the function being "noop", but the loop *itself* becomes
> superfluous. So then we might want to compare the function pointer
> against "noop" outside of the loop... and that way we get a bunch of
> complications :)
>
> I chose this approach because it is purely additive and precisely as
> generic/specific as it needs to be. We already have 11 use cases, so I
> don't think it's *too* specific.

 We may still want some division of:

 DEFINE_VECTOR_TYPE (int_vector, int);
 DEFINE_POINTER_VECTOR_TYPE (string_vector, char *, free);

 where under the hood, DEFINE_POINTER_VECTOR_TYPE(type, base, fn)
 invokes both DEFINE_VECTOR_TYPE(type, base) and
 DEFINE_VECTOR_EMPTY(type, fn), or whatever we name the second
 function. 
This is doable, but I hope it's not expected that
DEFINE_POINTER_VECTOR_TYPE() *enforce* that the element type be a pointer :)

...

 From your other mail in this subthread:

>>> +#define DEFINE_VECTOR_EMPTY(name)                                      \
>>
>> I'm going to be that guy ...
>
> Yes, someone has to be! I knew it was virtually impossible for me to get
> the name right at the first try :)
>
>>
>> Can we call it ADD_VECTOR_EMPTY_METHOD or similar/better?
>
> Eric, any comments?

 ADD_VECTOR_EMPTY_METHOD() instead of DEFINE_VECTOR_EMPTY() works for
 me. 
OK, ADD_VECTOR_EMPTY_METHOD() can work with the above.

...
 Whether we hard-code 'free()' as the lone function that will
be
 utilized in the generated TYPE_vector_empty(), or if the macro takes
 fn as a parameter, is up to you. 
I prefer hard-coding free(). Once we expose the callback, we should
arguably -- for generality's sake! -- expose a context pointer too, one
that the callback receives invariantly alongside the element to free.
And then we're back to where we are now: we can't just pass "free",
because "free" does not take a context pointer.

...

 Comparing:

 DEFINE_VECTOR_TYPE(string_vector, char *);
 ADD_VECTOR_EMPTY_METHOD(string_vector);

 vs.

 DEFINE_VECTOR_TYPE(string_vector, char *);
 ADD_VECTOR_EMPTY_METHOD(string_vector, free);

 does either one make it more obvious that we are doing a two-step
 definition (first of the type, then of the added cleanup method)? 
I think both are equally functional, but the latter is at a worse spot
IMO on the "generality spectrum". It's too generic, yet not generic
enough. Too generic because we only need "free" for now, and not generic
enough because a truly generic callback would take an additional void*
context pointer.

...
 And
 if so, does my idea of a single wrapper function that calls both 
s/wrapper function/wrapper macro/

...
 intermediate macros make sense 
Sure.

In practical terms, the difference is whether I *add* a new macro
invocation to "common/utils/string-vector.h", or *replace* the existing
macro invocation there. I *think* the wrapper macro is slightly overkill
(we have 1 use for it!), but it's certainly doable.

...
 so that the 11 pointer vector types in
 libnbd are still one-liner macro calls, without penalizing the scalar
 vector types in nbdkit?

Yep.

Thanks!
Laszlo

2026

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

Re: [Libguestfs] [libnbd PATCH v3 05/29] vector: (mostly) factor out DEFINE_VECTOR_EMPTY