[PATCH v2 1/7] util: introduce a parser for kernel cmdline arguments

Paulo de Rezende Pinatti ppinatti at linux.ibm.com
Fri Jun 12 14:51:40 UTC 2020


On 11/06/20 10:09, Erik Skultety wrote:
> On Wed, Jun 10, 2020 at 03:55:14PM +0200, Paulo de Rezende Pinatti wrote:
>>
>> On 08/06/20 16:35, Erik Skultety wrote:
>>> On Fri, May 29, 2020 at 12:10:03PM +0200, Paulo de Rezende Pinatti wrote:
>>>> Introduce two utility functions to parse a kernel command
>>>> line string according to the kernel code parsing rules in
>>>> order to enable the caller to perform operations such as
>>>> verifying whether certain argument=value combinations are
>>>> present or retrieving an argument's value.
>>>>
>>>> Signed-off-by: Paulo de Rezende Pinatti <ppinatti at linux.ibm.com>
>>>> Signed-off-by: Boris Fiuczynski <fiuczy at linux.ibm.com>
>>>> ---
>>>>    src/libvirt_private.syms |   2 +
>>>>    src/util/virutil.c       | 169 +++++++++++++++++++++++++++++++++++++++
>>>>    src/util/virutil.h       |  17 ++++
>>>>    tests/utiltest.c         | 141 ++++++++++++++++++++++++++++++++
>>>>    4 files changed, 329 insertions(+)
>>>>
>>>> diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
>>>> index a6af44fe1c..a206a943c5 100644
>>>> --- a/src/libvirt_private.syms
>>>> +++ b/src/libvirt_private.syms
>>>> @@ -3433,6 +3433,8 @@ virHostGetDRMRenderNode;
>>>>    virHostHasIOMMU;
>>>>    virIndexToDiskName;
>>>>    virIsDevMapperDevice;
>>>> +virKernelCmdlineMatchParam;
>>>> +virKernelCmdlineNextParam;
>>>>    virMemoryLimitIsSet;
>>>>    virMemoryLimitTruncate;
>>>>    virMemoryMaxValue;
>>>> diff --git a/src/util/virutil.c b/src/util/virutil.c
>>>> index fb46501142..749c9d7116 100644
>>>> --- a/src/util/virutil.c
>>>> +++ b/src/util/virutil.c
>>>> @@ -1725,6 +1725,175 @@ virHostGetDRMRenderNode(void)
>>>>        return ret;
>>>>    }
>>>>
>>>> +
>>>> +static const char *virKernelCmdlineSkipDbQuote(const char *cmdline,
>>>
>>> minor nitpick: we can call ^this simply ...SkipQuote, we don't need to be so
>>> specific about it being a double quotation mark, do we?
>>>
>>
>> Sure, changed to SkipQuote.
>>
>>>> +                                               bool *is_quoted)
>>>> +{
>>>> +    if (cmdline[0] == '"') {
>>>> +        *is_quoted = !(*is_quoted);
>>>> +        cmdline++;
>>>> +    }
>>>> +    return cmdline;
>>>> +}
>>>> +
>>>> +
>>>> +static size_t virKernelCmdlineSearchForward(const char *cmdline,
>>>> +                                            bool *is_quoted,
>>>> +                                            bool include_equal)
>>>
>>> Hmm, what if instead we tried to find and return the index of the '=' character
>>> but iterated all the way until the next applicable (i.e. taking quotation into
>>> account) space and saved that end of arg/arg=val parameter into **res. The
>>> caller of this function would then continue directly from *res with the next
>>> arg/arg=val parameter.
>>> We could then call this something like virKernelCmdlineFindEqual and return -1
>>> if there is no '=' sign, indicating that it's a standalone parameter with no
>>> value.
>>>
>>
>> Yes, we can do that. It would look like this:
>>
>>      size_t index;
> 
> nitpick: ^this is a standard iteration variable, so naming it "i" might look
> more "expected"
> 

Changed.

>>      size_t equal_index = 0;
>>
>>      for (index = 0; cmdline[index]; index++) {
>>          if (!(is_quoted) && g_ascii_isspace(cmdline[index]))
>>              break;
>>          if (equal_index == 0 && cmdline[index] == '=') {
>>              equal_index = index;
>>              continue;
>>          }
>>          virKernelCmdlineSkipQuote(cmdline + index, &is_quoted);
>>      }
>>      *res = cmdline + index;
>>      return equal_index;
>>
>>
>> I found it more convenient to use 0 rather than -1 so that we can also
>> handle the case when the argument itself starts with the equal sign.
> 
> Yes, that's fine, but please provide a doc string for this function mentioning
> this.
> 

Will do.

> ...
> 
>>
>> So the function would look like this now:
>>
>>      virSkipSpaces(&cmdline);
>>      cmdline = virKernelCmdlineSkipQuote(cmdline, &is_quoted);
>>      equal_index = virKernelCmdlineFindEqual(cmdline, is_quoted, &next);
>>
>>      if (next == cmdline)
>>          return next;
>>
>>      /* param has no value */
>>      if (equal_index == 0) {
>>          if (is_quoted && next[-1] == '"')
>>              *param = g_strndup(cmdline, next - cmdline - 1);
>>          else
>>              *param = g_strndup(cmdline, next - cmdline);
>>          return next;
>>      }
>>
>>      *param = g_strndup(cmdline, equal_index);
>>
>>      if (cmdline[equal_index + 1] == '"') {
>>          is_quoted = true;
>>          equal_index++;
>>      }
>>
>>      if (is_quoted && next[-1] == '"')
>>          *val = g_strndup(cmdline + equal_index + 1,
>>                           next - cmdline - equal_index - 2);
>>      else
>>          *val = g_strndup(cmdline + equal_index + 1,
>>                           next - cmdline - equal_index - 1);
>>      return next;
>>
>> There's still a bit of handling required due to the kernel's quoting rules.
>> I took the liberty of using 'next' in place of 'res' as I think it conveys
>> better its purpose. Let me know if that looks good to you.
> 
> That is fine, looks good.
> 
>>
>>>> +}
>>>> +
>>>> +
>>>> +#define VIR_CMDLINE_STR_CMP(kernel_val, caller_val, flags) \
>>>> +    (((flags & VIR_KERNEL_CMDLINE_FLAGS_CMP_EQ) && \
>>>> +      STREQ(kernel_val, caller_val)) || ((flags & VIR_KERNEL_CMDLINE_FLAGS_CMP_PREFIX) && \
>>>> +                                         STRPREFIX(kernel_val, caller_val)))
>>>> +
>>>> +
>>>> +/*
>>>> + * Try to match the provided kernel cmdline string with the provided @arg
>>>> + * and the list @values of possible values according to the matching strategy
>>>> + * defined in @flags. Possible options include:
>>>> + * - VIR_KERNEL_CMDLINE_FLAGS_CMP_PREFIX: do a substring comparison of values
>>>
>>> I know you used it in the following patches to match against the accepted
>>> values, but do we really need to match with a prefix, I'd be fine with simple
>>> stirng equality matching in all cases and not mix the matching strategy for
>>> both values and kernel arguments.
>>>
>>
>> The prefix based comparison is to make sure libvirt matches exactly like the
>> kernel code does for the parameter prot_virt (see arch/s390/kernel/uv.c ->
>> prot_virt_setup -> kstrtobool) which performs prefix based matching. Using
>> equality matching should work for most cases but there would be certain
>> corner cases missed (i.e. prot_virt=yfoo will be recognized as 'activate' by
>> the kernel but not by libvirt).
> 
> That..is...interesting...
> 
>>
>> If you think such cases are not worth the trouble I can remove the PREFIX
>> flag, let me know what you prefer.
> 
> Well, given that kernel supposedly recognizes "yfoo" as "on" just because of
> the prefix, we obviously can't neglect the PREFIX matching can we? If we did
> and someone indeed used such a horrible example of a cmdline,
> virt-host-validate would lie about the feature not being enabled (even though
> it would be) because libvirt would only do an equality match, so it has to stay.
> 
> <reprove>bad kernel, bad kernel</reprove>
> 
> Just to re-assure myself, that's a generic behavior used for all option parsing
> in the kernel not just prot_virt, right? If so, then contrary to my previous
> response, we should always use and default to prefix matching, because given
> the situation equality matching would clearly not be a satisfactory behaviour.
> 

No, actually other parameters might use equality matching, this is the 
case with 'mem_encrypt' (see arch/x86/mm/mem_encrypt_identity.c -> 
sme_enable -> strncmp at line 575) which was being checked for SEV in 
the first version of this series. I think we are fine with the current 
approach as we allow the caller to choose which matching strategy to use 
by specifying the appropriate flag.

> ...
> 
>>
>>
>>>> + * - VIR_KERNEL_CMDLINE_FLAGS_SEARCH_STICKY: first positive match satifies search
>>>
>>> is "sticky searching" some terminus technicus? If not, we probably should name
>>> this FLAGS_MATCH_FIRST and FLAGS_MATCH_LAST respectively.
>>>
>>
>> I guess the flag explanation was not clear enough, sticky actually means
>> "stop as soon as you find a match, no matter the order", so
> 
> Well, isn't "stop as soon as you find a match" == match first occurrence?
> 

I suppose one can also see that way :) I will thus use SEARCH_FIRST as 
requested.

>> FLAGS_SEARCH_FIRST doesn't really fit here. A better name could be
>> FLAGS_SEARCH_ANY. Let me know if you agree.
>>
> 
> ...
> 
>>>>
>>>> +struct testKernelCmdlineNextParamData
>>>> +{
>>>> +    const char *cmdline;
>>>> +    const char *param;
>>>> +    const char *val;
>>>> +    const char *next;
>>>> +};
>>>> +
>>>> +static struct testKernelCmdlineNextParamData kEntries[] = {
>>>> +    { "arg1 arg2 arg3=val1",                            "arg1",           NULL,                  " arg2 arg3=val1" },
>>>> +    { "arg1=val1 arg2 arg3=val3 arg4",                  "arg1",           "val1",                " arg2 arg3=val3 arg4" },
>>>> +    { "arg3=val3 ",                                     "arg3",           "val3",                " " },
>>>> +    { "arg3=val3",                                      "arg3",           "val3",                "" },
>>>> +    { "arg-3=val3 arg4",                                "arg-3",          "val3",                " arg4" },
>>>> +    { "arg_3=val3 arg4",                                "arg-3",          "val3",                " arg4" },
>>>> +    { " arg_3=val3 arg4",                               "arg-3",          "val3",                " arg4" },
>>>> +    { "  arg-3=val3 arg4",                              "arg-3",          "val3",                " arg4" },
>>>> +    { "arg2=\"value with spaces\" arg3=val3",           "arg2",           "value with spaces",   " arg3=val3" },
>>>> +    { " arg2=\"value with spaces\"   arg3=val3",        "arg2",           "value with spaces",   "   arg3=val3" },
>>>> +    { "  \"arg2=value with spaces\" arg3=val3",         "arg2",           "value with spaces",   " arg3=val3" },
>>>> +    { "arg2=\"val\"ue arg3",                            "arg2",           "val\"ue",             " arg3" },
>>>> +    { " arg3\" escaped=val2\"",                         "arg3\" escaped", "val2",                "" },
>>>
>>> ^Is this even valid for the kernel itself? Looking at [1], they clearly don't
>>> allow escaped \" in the arg/value.
>>>
>>> [1] https://github.com/torvalds/linux/blob/db54615e21419c3cb4d699a0b0aa16cc44d0e9da/lib/cmdline.c
>>>
>>
>> I guess the word "escaped" in this test is a bit misleading; it's actually
>> escaping the blank space, not the quote itself. This is valid for the
>> kernel. In order to assure our parsing results would match those of the
>> kernel I executed the code in cmdline.c in a standalone file to generate the
>> reference values for the test.
>>
>> I'll change "arg3\" escaped" to "arg3\" with spaces" to clearly state the
>> intention here.
> 
> Sure, that will help, although I wasn't relating to the word "escaped" in my
> reply. Nevermind though, my bad, I just couldn't wrap my head around seeing
> "val\"ue" and somehow could not make sense out of it, too much QE to process I
> guess, it's fine.
> 
> Regards,
> Erik
> 

-- 
Best regards,

Paulo de Rezende Pinatti




More information about the libvir-list mailing list