[PATCH 1/6] util: introduce a parser for kernel cmdline arguments

Paulo de Rezende Pinatti ppinatti at linux.ibm.com
Thu May 28 14:50:05 UTC 2020



On 15/05/20 16:19, Erik Skultety wrote:
> On Mon, May 11, 2020 at 06:41:56PM +0200, Boris Fiuczynski wrote:
>> From: Paulo de Rezende Pinatti <ppinatti at linux.ibm.com>
>>
>> Introduce two utility functions to parse a kernel command
>> line string according to the kernel code parsing rules in
>> order to enable the caller to perform operations such as
>> verifying whether certain argument=value combinations are
>> present or retrieving an argument's value.
>>
>> Signed-off-by: Paulo de Rezende Pinatti <ppinatti at linux.ibm.com>
>> Reviewed-by: Boris Fiuczynski <fiuczy at linux.ibm.com>
>> ---
>>   src/libvirt_private.syms |   2 +
>>   src/util/virutil.c       | 173 +++++++++++++++++++++++++++++++++++++++
>>   src/util/virutil.h       |  18 ++++
>>   tests/utiltest.c         | 130 +++++++++++++++++++++++++++++
>>   4 files changed, 323 insertions(+)
>>
>> diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
>> index 935ef7303b..25b22a3e3f 100644
>> --- a/src/libvirt_private.syms
>> +++ b/src/libvirt_private.syms
>> @@ -3428,6 +3428,8 @@ virHostGetDRMRenderNode;
>>   virHostHasIOMMU;
>>   virIndexToDiskName;
>>   virIsDevMapperDevice;
>> +virKernelCmdlineGetValue;
>> +virKernelCmdlineMatchParam;
>>   virMemoryLimitIsSet;
>>   virMemoryLimitTruncate;
>>   virMemoryMaxValue;
>> diff --git a/src/util/virutil.c b/src/util/virutil.c
>> index fb46501142..264aae8d01 100644
>> --- a/src/util/virutil.c
>> +++ b/src/util/virutil.c
>> @@ -1725,6 +1725,179 @@ virHostGetDRMRenderNode(void)
>>       return ret;
>>   }
>>   
>> +
>> +#define VIR_IS_CMDLINE_SPACE(value) \
>> +    (g_ascii_isspace(value) || (unsigned char) value == 160)
> 
> Hmm, we're not checking the non-breaking space anywhere else in the code, so I
> think we'd be fine not checking it here either, so plain g_ascii_isspace()
> would suffice in the code
> 
>> +
>> +
>> +static bool virKernelArgsAreEqual(const char *arg1,
>> +                                  const char *arg2,
>> +                                  size_t n)
>> +{
>> +    size_t i;
>> +
>> +    for (i = 0; i < n; i++) {
>> +        if ((arg1[i] == '-' && arg2[i] == '_') ||
>> +            (arg1[i] == '_' && arg2[i] == '-') || (arg1[i] == arg2[i])) {
> 
> Because '-' and '_' are equal in the parameter syntax, rather than introducing
> another string equality function just because of this, we should normalize the
> inputs by replacing occurrences of one with the other and then simply do STREQ
> at the appropriate spot in the code
> 
> 
>> +            if (arg1[i] == '\0')
>> +                return true;
>> +            continue;
>> +        }
>> +        return false;
>> +    }
>> +    return true;
>> +}
>> +
>> +
>> +/*
>> + * Parse the kernel cmdline and store the value of @arg in @val
>> + * which can be NULL if @arg has no value or if it's not found.
>> + * In addition, store in @next the address right after
>> + * @arg=@value for possible further processing.
>> + *
>> + * @arg: kernel command line argument
>> + * @cmdline: kernel command line string to be checked for @arg
>> + * @val: pointer to hold retrieved value of @arg
>> + * @next: pointer to hold address right after @arg=@val, will
>> + * point to the string's end (NULL) in case @arg is not found
>> + *
>> + * Returns 0 if @arg is present in @cmdline, 1 otherwise
>> + */
>> +int virKernelCmdlineGetValue(const char *arg,
>> +                             const char *cmdline,
>> +                             char **val,
>> +                             const char **next)
>> +{
>> +    size_t i = 0, param_i;
> 
> in this specific case I think that naming the iteration variable param_i is
> more confusing than actually settling down with something like "offset"
> especially when you're doing arithmetics with it.
> 
>> +    size_t arg_len = strlen(arg);
>> +    bool is_quoted, match;
> 
> 1 declaration/definition per line please
> 
>> +
>> +    *next = cmdline;
>> +    *val = NULL;
>> +
>> +    while (cmdline[i] != '\0') {
>> +        /* remove leading spaces */
>> +        while (VIR_IS_CMDLINE_SPACE(cmdline[i]))
>> +            i++;
> 
> For ^this, we already have a primitive virSkipSpaces()
> 
>> +        if (cmdline[i] == '"') {
>> +            i++;
>> +            is_quoted = true;
>> +        } else {
>> +            is_quoted = false;
>> +        }
>> +        for (param_i = i; cmdline[param_i]; param_i++) {
>> +            if ((VIR_IS_CMDLINE_SPACE(cmdline[param_i]) && !is_quoted) ||
>> +                cmdline[param_i] == '=')
>> +                break;
>> +            if (cmdline[param_i] == '"')
>> +                is_quoted = !is_quoted;
>> +        }
>> +        if (param_i-i == arg_len && virKernelArgsAreEqual(cmdline+i, arg, arg_len))
> 
> We don't use Yoda conditions, so ^this should be arg_len == param_i - i
> 
>> +            match = true;
>> +        else
>> +            match = false;
>> +        /* arg has no value */
>> +        if (cmdline[param_i] != '=') {
>> +            if (match) {
>> +                *next = cmdline+param_i;
> 
> please use a space in between the operands and the operator
> 
>> +                return 0;
>> +            }
>> +            i = param_i;
>> +            continue;
>> +        }
>> +        param_i++;
>> +        if (cmdline[param_i] == '\0')
>> +            break;
>> +
>> +        if (cmdline[param_i] == '"') {
>> +            param_i++;
>> +            is_quoted = !is_quoted;
>> +        }
>> +        i = param_i;
>> +
>> +        for (; cmdline[param_i]; param_i++) {
>> +            if (VIR_IS_CMDLINE_SPACE(cmdline[param_i]) && !is_quoted)
>> +                break;
>> +            if (cmdline[param_i] == '"')
>> +                is_quoted = !is_quoted;
>> +        }
>> +        if (match) {
>> +            *next = cmdline+param_i;
>> +            if (cmdline[param_i-1] == '"')
>> +                *val = g_strndup(cmdline+i, param_i-i-1);
>> +            else
>> +                *val = g_strndup(cmdline+i, param_i-i);
>> +            return 0;
>> +        }
>> +        i = param_i;
>> +    }
>> +    *next = cmdline+i;
>> +    return 1;
> 
> Anyhow, this function is IMO doing too much on its own - parsing tokens,
> matching the arguments, and extracting parameters and their values. All of that
> should be split into helpers to enhance readability. Unfortunately you can't
> use our existing tokenizer virStringSplit because of values containing spaces,
> so you'll have to write your own that takes that into account, it should return
> a token (parameter or parameter=value), then parse the returned token into
> key-value pair, match the key with the input key and return whatever is needed
> to be returned.
> 
>> +}
>> +
>> +
>> +#define VIR_CMDLINE_STR_CMP(kernel_val, caller_val, flags) \
>> +    (((flags & VIR_KERNEL_CMDLINE_FLAGS_CMP_EQ) && \
>> +      STREQ(kernel_val, caller_val)) || ((flags & VIR_KERNEL_CMDLINE_FLAGS_CMP_PREFIX) && \
>> +                                         STRPREFIX(kernel_val, caller_val)))
>> +
>> +
>> +/*
>> + * Try to match the provided kernel cmdline string with the provided @arg
>> + * and the list @values of possible values according to the matching strategy
>> + * defined in @flags. Possible options include:
>> + * - VIR_KERNEL_CMDLINE_FLAGS_CMP_PREFIX: do a substring comparison of values
>> + *   (uses size of value provided as input)
>> + * - VIR_KERNEL_CMDLINE_FLAGS_CMP_EQ: do a strict string comparison
>> + * - VIR_KERNEL_CMDLINE_FLAGS_SEARCH_STICKY: first positive match satifies search
> 
> Why are ^these constants needed? Especially the last two. Kernel is parsing the
> parameters sequentially, so the last one wins. We should match that behaviour
> when determining the value of a setting and ignore any previous occurrences of
> a parameter.

It's not always the case that the last one wins. In the case of the 
prot_virt parameter for example as soon as a positive value (prot_virt=1 
or prot_virt=y) shows up in the kernel cmdline the feature will be 
enabled, even if the last entry is negative. In order to handle the two 
possible behaviors we introduced the *_SEARCH flags SEARCH_STICKY 
(prot_virt behavior) and SEARCH_LAST (usual behavior, last one wins).

As to the *_CMP flags, prot_virt again accepts multiple values and does 
a prefix-based check (see arch/s390/kernel/uv.c -> prot_virt_setup -> 
kstrtobool) so we introduced CMP_PREFIX to handle that, while CMP_EQ is 
for the usual case when the whole value is considered.

> 
> Regards,
> Erik
> 

-- 
Best regards,

Paulo de Rezende Pinatti




More information about the libvir-list mailing list