On 2/17/23 19:36, Eric Blake wrote:
On Wed, Feb 15, 2023 at 04:39:35PM +0000, Richard W.M. Jones wrote:
> On Wed, Feb 15, 2023 at 03:11:41PM +0100, Laszlo Ersek wrote:
>> prepare_socket_activation_environment() is a construction function that is
>> supposed to fill in a string_vector object from the ground up. Right now
>> it has its responsibilities mixed up in two ways:
>>
>> - it expects the caller to pass in a previously re-set string_vector,
>>
>> - if it fails, it calls set_error() internally (with a blanket reference
>> to "malloc").
>>
>> Fix both warts:
>>
>> - pass in an *uninitialized* (only allocated) string vector from the
>> caller, and initialize it in prepare_socket_activation_environment(),
>>
>> - move the set_error() call out to the caller.
>>
>> Signed-off-by: Laszlo Ersek <lersek(a)redhat.com>
>> ---
>> generator/states-connect-socket-activation.c | 6 +++---
>> 1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/generator/states-connect-socket-activation.c
b/generator/states-connect-socket-activation.c
>> index c46a0bf5c0a3..b5e146539cc8 100644
>> --- a/generator/states-connect-socket-activation.c
>> +++ b/generator/states-connect-socket-activation.c
>> @@ -51,7 +51,7 @@ prepare_socket_activation_environment (string_vector *env)
>> char *p;
>> size_t i;
>>
>> - assert (env->len == 0);
>> + *env = (string_vector)empty_vector;
>
> Do you actually need to cast this?
Elsewhere in the code, we overwhelmingly do not use the cast. C++
might require it, but we're using C.
No, that's not the right explanation. Elsewhere in the code, we do not
use *assignments* for prepping our vector structures with
"empty_vector". Instead, everywhere else in the code, we use
*initialization*.
Those are different contexts.
If you expand the "empty_vector" macro in the below:
string_vector v1 = empty_vector;
string_vector v2;
v2 = empty_vector;
we get:
string_vector v1 = { .ptr = NULL, .len = 0, .cap = 0 };
string_vector v2;
v2 = { .ptr = NULL, .len = 0, .cap = 0 };
For "v1", we have a definition that is also an initialization, so the
initializer
{ .ptr = NULL, .len = 0, .cap = 0 }
is fine there.
For "v2", we have a definition that is not an initialization, and we
have an additional *assignment*. For the right hand side of the
assignment, the initializer
{ .ptr = NULL, .len = 0, .cap = 0 }
is *not* fine. However, if we write:
v2 = (string_vector){ .ptr = NULL, .len = 0, .cap = 0 };
then the RHS of the assignment becomes a C99 *compound literal*, which
the initializer
{ .ptr = NULL, .len = 0, .cap = 0 }
is a *proper part of*. The compound literal
(string_vector){ .ptr = NULL, .len = 0, .cap = 0 };
is a nameless object with automatic storage duration. It's something like:
string_vector v2;
string_vector nameless_empty = { .ptr = NULL, .len = 0, .cap = 0 };
v2 = nameless_empty;
... I have a few more comments here:
- I generally prefer to *avoid* initialization (especially
initialization where the initializer is complex, like a function call!),
and use assignments that are performed as late (i.e., as narrowly) as
possible. In my opinion, this conveys information about data flow. (I
don't like arguments such as "let's initialize this just to be sure" --
we should *know* when a variable is first consumed!)
Anyway, in the present case, this argument of mine is not relevant,
because the object in question (*env) already exists, we're not
*defining* it in the first place (~ definition is such a declaration
that allocates space), so we cannot initialize it either (in the C
language sense); we can only assign another structure to it, or access
its fields, or use memset().
C code that predates C99 would certainly use individual field accesses:
env->ptr = NULL;
env->len = 0;
env->cap = 0;
or call memset:
memset (env, 0, sizeof *env);
or use a small "initializer function" that would thinly wrap one of the
above variants.
However, in C99, we have compound literals, and that's a great way for
reusing the "empty_vector" macro in a (structure) *assignment*.
>
>> /* Reserve slots env[0] and env[1]. */
>> p = strdup ("LISTEN_PID=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx");
>> @@ -90,7 +90,6 @@ prepare_socket_activation_environment (string_vector *env)
>> return 0;
>>
>> err:
>> - set_error (errno, "malloc");
On a rough level, this was correct but unspecific (we only expect
failure due to memory allocations, but didn't do the allocation
locally and don't know which function allocated).
I agree, to an extent. The particular "malloc" string literal is
somewhat inexact at this point, but semantically OK. It will become more
inexact only later.
However, my main point here is that this set_error() call does not
belong in the callee. It belongs in the caller. First because the callee
does lots of things (and will do more). Second, because the caller
already has explicit set_error() calls on all other error branches.
>> string_vector_empty (env);
>> return -1;
>> }
>> @@ -99,7 +98,7 @@ STATE_MACHINE {
>> CONNECT_SA.START:
>> int s;
>> struct sockaddr_un addr;
>> - string_vector env = empty_vector;
>> + string_vector env;
>> pid_t pid;
>>
>> assert (!h->sock);
>> @@ -156,6 +155,7 @@ CONNECT_SA.START:
>>
>> if (prepare_socket_activation_environment (&env) == -1) {
>> SET_NEXT_STATE (%.DEAD);
>> + set_error (errno, "prepare_socket_activation_environment");
>
> Why move this out of the function?
Moving it here lets us give a more specific message about a function
at a different layer in the stack.
Yes, that's it, basically.
Most memory failures are already
going to be a pain where we don't know if the caller will ever get
back a desired error message, so being inspecific doesn't necessarily
hurt. But I also don't see any technical reasons to avoid this patch.
Reviewed-by: Eric Blake <eblake(a)redhat.com>
Thanks,
Laszlo