On Mon, Sep 10, 2018 at 10:42:59AM -0500, Eric Blake wrote:
On 9/10/18 10:27 AM, Richard W.M. Jones wrote:
>Previously key=value on the command line allowed the key to be pretty
>much anything that didn't contain an '=' character. Even empty
>strings were permitted.
>
>This tightens up the permitted keys so they must contain only ASCII
>alphanumeric, period, underscore or dash characters; must not be an
>empty string; and must start with an ASCII alphabetic character.
>---
> docs/nbdkit-plugin.pod | 24 +++++++++++++++---------
> src/main.c | 36 +++++++++++++++++++++++++++++++++++-
> 2 files changed, 50 insertions(+), 10 deletions(-)
>
>+Both C<key> and C<value> parameters will be non-NULL.
>+
>+They key will be a non-empty string beginning with an ASCII alphabetic
s/They/The/
>+character (C<A-Z> C<a-z>). The rest of the key must contain only
>+ASCII alphanumeric plus period, underscore or dash characters (C<A-Z>
>+C<a-z> C<0-9> C<.> C<_> C<->).
No leading underscore? I'm okay with that, but it is slightly
stricter than C and Shell variables (then again, accepting '.' and
'-' already makes us different). And we can always add stuff later
if it proves useful (it's easier to start strict and relax later,
compared to starting relaxed and then tightening down when we
realize the problems it caused).
>+/* When parsing plugin and filter config key=value from the command
>+ * line, check that the key is a simple alphanumeric with period,
>+ * underscore or dash.
>+ *
>+ * Note this doesn't return an error. If the key is not valid then we
>+ * return false and the parsing code will assume that this is a bare
>+ * value instead.
>+ */
>+static int
>+is_config_key (const char *key, size_t len)
>+{
>+ static const char allowed_first[] =
>+ "abcdefghijklmnopqrstuvwxyz"
>+ "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
>+ static const char allowed[] =
>+ "abcdefghijklmnopqrstuvwxyz"
>+ "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
>+ "0123456789"
>+ "._-";
Crazy me is thinking of the micro-optimization:
static const char allowed[] =
"0123456789"
".-_"
"abc..z"
"ABC..Z";
True! But ...
...
>+
>+ if (len == 0)
>+ return 0;
>+
>+ if (strchr (allowed_first, key[0]) == NULL)
>+ return 0;
where this is: 'strchr(allowed + 13, key[0])' (or allowed + 12, if
you allow leading underscore). But please don't - while it's
clever, I don't think it's friendly to future readers, and the
optimization is not in hot code, nor is shaving 52 bytes of .data by
avoiding repetition between the two 'allowed' constants really a
deal-breaker.
Interestingly I thought that GCC optimized this automatically,
combining strings with common suffixes. However when I reordered the
strings as suggested above to:
static const char allowed_first[] =
"abcdefghijklmnopqrstuvwxyz"
"ABCDEFGHIJKLMNOPQRSTUVWXYZ";
static const char allowed[] =
"._-"
"0123456789"
"abcdefghijklmnopqrstuvwxyz"
"ABCDEFGHIJKLMNOPQRSTUVWXYZ";
I didn't observe GCC or the linker combining these strings
(gcc-8.2.1-2.fc29.x86_64).
It certainly _used_ to do that last time I hacked on GCC. But that
was in, erm, 1995, so I suppose it's not surprising this has changed :-)
>+
>+ /* This works in context of the caller since key[len] == '='. */
>+ if (strspn (key, allowed) != len)
>+ return 0;
I like it. With the one typo fix,
Reviewed-by: Eric Blake <eblake(a)redhat.com>
Thanks, I'll push this series with the fix.
Rich.
--
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones
Read my programming and virtualization blog:
http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines. Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v