On 5/8/23 23:33, Richard W.M. Jones wrote:
On Mon, May 08, 2023 at 08:54:48AM +0200, Laszlo Ersek wrote:
> On 5/7/23 12:43, Richard W.M. Jones wrote:
>> This is not secure so should not be used routinely. Also we do not
>> attempt to filter environment variables, so even ones containing
>> multiple lines or special characters are all sent to nbdkit_debug.
>>
>> The reason for adding this is to allow for debugging the new
>> nbd_set_socket_activation_name(3) API added in libnbd 1.16.
>> ---
>> docs/nbdkit.pod | 10 ++++++++++
>> server/main.c | 28 ++++++++++++++++++++++++++++
>> 2 files changed, 38 insertions(+)
>>
>> diff --git a/docs/nbdkit.pod b/docs/nbdkit.pod
>> index 83cf6a11c..22350cc08 100644
>> --- a/docs/nbdkit.pod
>> +++ b/docs/nbdkit.pod
>> @@ -619,6 +619,16 @@ are numerous and not usually very interesting.
>> S<I<-D nbdkit.backend.controlpath=0>> suppresses the non-datapath
>> commands (config, open, close, can_write, etc.)
>>
>> +=item B<-D nbdkit.environ=1>
>> +
>> +(not Windows)
>> +
>> +Print nbdkit's environment variables in the debug output at start up.
>> +This is insecure because environment variables may contain both
>> +sensitive and user-controlled information, so it should not be used
>> +routinely. But it is useful for tracking down problems related to
>> +environment variables.
>> +
>> =item B<-D nbdkit.tls.log=>N
>>
>> Enable TLS logging. C<N> can be in the range 0 (no logging) to 99.
>> diff --git a/server/main.c b/server/main.c
>> index 1df5d69ac..7b6adc3cc 100644
>> --- a/server/main.c
>> +++ b/server/main.c
>> @@ -210,6 +210,26 @@ dump_config (void)
>> #endif
>> }
>>
>> +#ifndef WIN32
>> +
>> +/* -D nbdkit.environ=1 to dump the environment at start up. */
>> +NBDKIT_DLL_PUBLIC int nbdkit_debug_environ;
>> +
>> +#ifndef HAVE_ENVIRON_DECL
>> +extern char **environ;
>> +#endif
>> +
>> +static void
>> +dump_environment (void)
>> +{
>> + size_t i;
>> +
>> + for (i = 0; environ[i]; ++i)
>> + nbdkit_debug ("%s", environ[i]);
>> +}
>> +
>> +#endif /* !WIN32 */
>> +
>> int
>> main (int argc, char *argv[])
>> {
>> @@ -662,6 +682,14 @@ main (int argc, char *argv[])
>> /* Check all debug flags were used, and free them. */
>> free_debug_flags ();
>>
>> +#ifndef WIN32
>> + /* Dump the environment if asked. This is the earliest we can do it
>> + * because it uses a debug flag.
>> + */
>> + if (nbdkit_debug_environ && verbose)
>> + dump_environment ();
>> +#endif
>> +
>> if (help) {
>> struct backend *b;
>>
>
> I was surprised not to see any assignment to the new global variable,
> but then I found "server/debug-flags.c"; that's some serious wizardry.
With the side effect, as it turns out, that it's very hard to
implement for language plugins. Rust is fine. OCaml can be done by
adding a C object to the link. Other languages, not really :-(
Hm. I didn't even realize it was being done like this for plugins' sake.
I assumed it was there to avoid boilerplate code.
I figure an alternative would be for -D to stash the flags in a
dictionary (or list) in the main program, and then each plugin / filter
could access that data structure (or even get a reference to it in (one
of) the entry point function(s)).
IIRC, if you build the main program with -rdynamic, then dynamically
loaded objects can use the main program for resolving their symbols...
Indeed, nbdkit already seems to use -rdynamic; at least "configure.ac"
checks for it.
> Side thought: should we not make both "nbdkit_debug_environ" (and
> friends), and the target of the "sym" pointer in
"apply_debug_flags",
> volatile? Aliasing doesn't get much more obscure than this, and I wonder
> if this has the potential to blow up later, when compilers (and
> processors) get even more clever.
>
> Of course POSIX could always say "this is just required to work", which
> I guess would be fine, but currently it doesn't seem to say it.
I guess Eric will know for sure, but isn't it generally true that all
global variables can be modified by any function call? My model is
that they can modified at any time by any called function; even
through a pointer so not really visible to the compiler.
Well to express it with some hand-waving, modifying globals through
pointers are of course fine, but it's about "pointer provenance". The
dlsym() use here isn't much different from reading a pointer from a
FILE* with fscanf("%p"), casting the result to int*, then de-referencing
the result. Collecting the addresses of global variables in a list and
passing the list to another function (even in a dl-opened object) is
different.
Just because the object is a global variable, concerns are not
immmediately removed. For example, per POSIX, if you get a signal
delivered (truly) asynchronously, you can only mostly assign a value to
a "volatile sig_atomic_t" global variable in the signal catching
function. (I'm not being exact here, see the spec for details.) The spec
doesn't explain (as far as I understand), but the "sig_atomic_t" type is
required so that no matter at what instruciton boundary the signal is
delivered, the variable can never be modified halfway, and the
"volatile" is there for the same visibility concern that I'm raising here.
Compilers have been getting more aggressive in exploiting the "strict
aliasing" rules; that was all I wanted to raise.
I'll comment some more under Eric's response.
Thanks!
Laszlo
> Anyway:
>
> Acked-by: Laszlo Ersek <lersek(a)redhat.com>
>
> Thanks for writing the patch!
> Laszlo
Rich.