On 5/18/23 17:59, Richard W.M. Jones wrote:
On Thu, May 18, 2023 at 05:42:03PM +0200, Laszlo Ersek wrote:
> On 5/18/23 15:42, Richard W.M. Jones wrote:
>> On Thu, May 18, 2023 at 03:09:42PM +0200, Laszlo Ersek wrote:
>>> Libguestfs uses the
>>>
>>> /dev/VG/LV
>>>
>>> format internally, for identifying LVM logical volumes, but the user might
>>> want to specify the
>>>
>>> /dev/mapper/VG-LV ID
>>>
>>> format with the "--key ID:..." options.
>>>
>>> Introduce unescape_device_mapper_lvm() for turning
>>>
>>> /dev/mapper/VG-LV
>>>
>>> key IDs into
>>>
>>> /dev/VG/LV
>>>
>>> ones, unescaping doubled hyphens to single hyphens in both VG and LV in
>>> the process.
>>>
>>> Call unescape_device_mapper_lvm() from key_store_import_key(). That is,
>>> translate the ID as soon as the "--key" option is processed -- let
the
>>> keystore only know about the usual /dev/VG/LV format, for later matching.
>>>
>>> Bugzilla:
https://bugzilla.redhat.com/show_bug.cgi?id=2168506
>>> Signed-off-by: Laszlo Ersek <lersek(a)redhat.com>
>>> ---
>>>
>>> Notes:
>>> v2:
>>>
>>> - rewrite without regular expressions [Rich, Laszlo]
>>>
>>> - restructure the commit message
>>>
>>> v1:
>>>
>>> regcomp() must definitely allocate memory dynamically, and we
>>> (intentionally) never free that -- we never call regfree(). I assume
>>> valgrind will catch this as a leak, so we might have to extend
>>> "valgrind-suppressions" in each dependent superproject.
However, I'm
>>> unable to run "make check-valgrind" successfully in e.g.
virt-v2v even
>>> before these patches; see also
>>>
<
https://listman.redhat.com/archives/libguestfs/2023-May/031496.html>.
>>>
>>> options/keys.c | 100 ++++++++++++++++++++
>>> 1 file changed, 100 insertions(+)
>>>
>>> diff --git a/options/keys.c b/options/keys.c
>>> index bc7459749276..52b27369016a 100644
>>> --- a/options/keys.c
>>> +++ b/options/keys.c
>>> @@ -260,6 +260,105 @@ key_store_add_from_selector (struct key_store *ks,
const char *selector)
>>> return key_store_import_key (ks, &key);
>>> }
>>>
>>> +/* Turn /dev/mapper/VG-LV into /dev/VG/LV, in-place. */
>>> +static void
>>> +unescape_device_mapper_lvm (char *id)
>>> +{
>>> + static const char dev[] = "/dev/", dev_mapper[] =
"/dev/mapper/";
>>> + const char *input_start;
>>> + char *output;
>>> + enum { M_SCAN, M_FILL, M_DONE } mode;
>>> +
>>> + if (!STRPREFIX (id, dev_mapper))
>>> + return;
>>> +
>>> + /* Start parsing "VG-LV" from "id" after
"/dev/mapper/". */
>>> + input_start = id + (sizeof dev_mapper - 1);
>>> +
>>> + /* Start writing the unescaped "VG/LV" output after
"/dev/". */
>>> + output = id + (sizeof dev - 1);
>>> +
>>> + for (mode = M_SCAN; mode < M_DONE; ++mode) {
>>> + char c;
>>> + const char *input = input_start;
>>> + const char *hyphen_buffered = NULL;
>>> + bool single_hyphen_seen = false;
>>> +
>>> + do {
>>> + c = *input;
>>> +
>>> + switch (c) {
>>> + case '-':
>>> + if (hyphen_buffered == NULL)
>>> + /* This hyphen may start an escaped hyphen, or it could be the
>>> + * separator in VG-LV.
>>> + */
>>> + hyphen_buffered = input;
>>> + else {
>>> + /* This hyphen completes an escaped hyphen; unescape it. */
>>> + if (mode == M_FILL)
>>> + *output++ = '-';
>>> + hyphen_buffered = NULL;
>>> + }
>>> + break;
>>> +
>>> + case '/':
>>> + /* Slash characters are forbidden in VG-LV anywhere. If there's
any,
>>> + * we'll find it in the first (i.e., scanning) phase, before we
output
>>> + * anything back to "id".
>>> + */
>>> + assert (mode == M_SCAN);
>>> + return;
>>> +
>>> + default:
>>> + /* Encountered a non-slash, non-hyphen character -- which also may
be
>>> + * the terminating NUL.
>>> + */
>>> + if (hyphen_buffered != NULL) {
>>> + /* The non-hyphen character comes after a buffered hyphen, so the
>>> + * buffered hyphen is supposed to be the single hyphen that
separates
>>> + * VG from LV in VG-LV. There are three requirements for this
>>> + * separator: (a) it must be unique (we must not have seen
another
>>> + * such separator earlier), (b) it must not be at the start of
VG-LV
>>> + * (because VG would be empty that way), (c) it must not be at the
end
>>> + * of VG-LV (because LV would be empty that way). Should any of
these
>>> + * be violated, we'll catch that during the first (i.e.,
scanning)
>>> + * phase, before modifying "id".
>>> + */
>>> + if (single_hyphen_seen || hyphen_buffered == input_start ||
>>> + c == '\0') {
>>> + assert (mode == M_SCAN);
>>> + return;
>>> + }
>>> +
>>> + /* Translate the separator hyphen to a slash character. */
>>> + if (mode == M_FILL)
>>> + *output++ = '/';
>>> + hyphen_buffered = NULL;
>>> + single_hyphen_seen = true;
>>> + }
>>> +
>>> + /* Output the non-hyphen character (including the terminating NUL)
>>> + * regardless of whether there was a buffered hyphen separator
(which,
>>> + * by now, we'll have attempted to translate and flush).
>>> + */
>>> + if (mode == M_FILL)
>>> + *output++ = c;
>>> + }
>>> +
>>> + ++input;
>>> + } while (c != '\0');
>>> +
>>> + /* We must have seen the VG-LV separator. If that's not the case,
we'll
>>> + * catch it before modifying "id".
>>> + */
>>> + if (!single_hyphen_seen) {
>>> + assert (mode == M_SCAN);
>>> + return;
>>> + }
>>> + }
>>> +}
>>
>> So this code can never return an error?
>
> It surely can; the one error mode is exactly the same as it was in v1
> (the regex version). Namely, the error mode is that the input pathname
> in "id" does not match the expected form (i.e., /dev/mapper/VG-LV), and
> then we don't modify "id" at all, we let "id" be added to the
keystore
> exactly as the user specified it.
>
> The error mode is activated whenever we reach any of the explicit return
> statements in the function:
>
> - when the first STRPREFIX call falls
>
> - or when any of the other three "return" statements are reached (which
> can only be reached during the scanning phase, M_SCAN; that is, *before*
> we write anything at all into "id").
>
> The point of separating M_SCAN from M_FILL is to verify the full format
> match before we start modifying "id" in-place. The traversal (parsing)
> through "VG-LV" is exactly the same in both cases, but in the first
> pass, we just check. If that completes fine, we repeat the same
> traversal, but then we also modify "id" in-place.
>
> The translation only ever contracts VG-LV (it produces at most as many
> bytes as it consumes), and the write point is always to the left of the
> read point, so the in-place modification, including the final
> NUL-termination, is safe.
>
>> eg if the input was "A-B-C",
>> I think it would translate the string to "A/B-C" which is an output
>> but the input seems like it was wrong.
>
> If "id" contains the string "/dev/mapper/A-B-C", then it
*mismatches*
> the expected form "/dev/mapper/VG-LV" just as much as an ID containing
> "/dev/sda2" would mismatch the expected form.
>
> In either of those cases, "id" does not get modified, and gets added to
> the keystore precisely as the user specified it on the command line.
>
> We don't say "this ID is wrong, error!". We don't know what the ID
is,
> we only know what it is *not*. We only say "this is not a
> /dev/mapper/VG-LV specification, we just don't know what it is supposed
> to be, so we won't touch it, we'll add it as it is, to the keystore".
>
> That's exactly the behavior (= blind addition) that we have right now,
> i.e., pre-patch.
OK got it, in that case:
Reviewed-by: Richard W.M. Jones <rjones(a)redhat.com>
> unescape_device_mapper_lvm() either modifies "id" in place (if
"id" on
> input has the expected form), or leaves "id" alone.
>
> The logic following the unescape_device_mapper_lvm() *call site* doesn't
> know which way the function went, but it does not *need* to know.
>
>> Or is it that only the VG is
>> escaped? (I don't imagine there is some specification for /dev/mapper
>> device names.)
>
> "/dev/mapper/A-B-C" simply cannot be interpreted according to the
> "/dev/mapper/VG-LV" scheme. For such an interpretation to be possible,
> we'd have to have one of the following strings in "id":
>
> - "/dev/mapper/A--B-C" --> "/dev/A-B/C"
> - "/dev/mapper/A-B--C" --> "/dev/A/B-C"
>
> If the user specifies (for example) "/dev/mapper/A-B-C", we reject that,
> so the ID remains "/dev/mapper/A-B-C", and that is what gets added to
> the keystore -- exactly as it happens pre-patch.
>
> If the user specifies (for example) "/dev/mapper/A--B--C", we also
> reject that (there is no separator to tell us what is the VG and what is
> the LV), so the ID remains "/dev/mapper/A--B--C", and that is what gets
> added to the keystore -- exactly as it happens pre-patch.
>
> FWIW, this aspect is identical between version 1 and version 2 of this
> patch: in v1, if the regex didn't match, then "id" would simply not be
> rewritten, and would get added to the keystore as specified by the user.
> The v1->v2 update is an implementation detal, internal to
> unescape_device_mapper_lvm() -- it only affects how the expected form is
> recognized (regex versus manual parsing), and how a matching input is
> translated (the output being constructed from matched regex
> subexpressions, versus byte-wise buffering and flushing integrated with
> scanning).
Thanks,
Rich.