[PATCH 1/6] util: introduce a parser for kernel cmdline arguments

Boris Fiuczynski fiuczy at linux.ibm.com
Thu May 28 14:31:03 UTC 2020


On 5/15/20 4:19 PM, Erik Skultety wrote:
> On Mon, May 11, 2020 at 06:41:56PM +0200, Boris Fiuczynski wrote:
>> From: Paulo de Rezende Pinatti <ppinatti at linux.ibm.com>
>>
>> Introduce two utility functions to parse a kernel command
>> line string according to the kernel code parsing rules in
>> order to enable the caller to perform operations such as
>> verifying whether certain argument=value combinations are
>> present or retrieving an argument's value.
>>
>> Signed-off-by: Paulo de Rezende Pinatti <ppinatti at linux.ibm.com>
>> Reviewed-by: Boris Fiuczynski <fiuczy at linux.ibm.com>
>> ---
>>   src/libvirt_private.syms |   2 +
>>   src/util/virutil.c       | 173 +++++++++++++++++++++++++++++++++++++++
>>   src/util/virutil.h       |  18 ++++
>>   tests/utiltest.c         | 130 +++++++++++++++++++++++++++++
>>   4 files changed, 323 insertions(+)
>>
>> diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
>> index 935ef7303b..25b22a3e3f 100644
>> --- a/src/libvirt_private.syms
>> +++ b/src/libvirt_private.syms
>> @@ -3428,6 +3428,8 @@ virHostGetDRMRenderNode;
>>   virHostHasIOMMU;
>>   virIndexToDiskName;
>>   virIsDevMapperDevice;
>> +virKernelCmdlineGetValue;
>> +virKernelCmdlineMatchParam;
>>   virMemoryLimitIsSet;
>>   virMemoryLimitTruncate;
>>   virMemoryMaxValue;
>> diff --git a/src/util/virutil.c b/src/util/virutil.c
>> index fb46501142..264aae8d01 100644
>> --- a/src/util/virutil.c
>> +++ b/src/util/virutil.c
>> @@ -1725,6 +1725,179 @@ virHostGetDRMRenderNode(void)
>>       return ret;
>>   }
>>   
>> +
>> +#define VIR_IS_CMDLINE_SPACE(value) \
>> +    (g_ascii_isspace(value) || (unsigned char) value == 160)
> 
> Hmm, we're not checking the non-breaking space anywhere else in the code, so I
> think we'd be fine not checking it here either, so plain g_ascii_isspace()
> would suffice in the code

Paulo wanted to make sure the function would always produce the same 
result as the kernel code itself and the kernel code recognizes 
character 160 as space.
In case you'd like to check that, see lib/cmdline.c -> function next_arg 
-> isspace -> include/linux/ctype.h -> __ismask & _S -> lib/ctype.c has 
a table with the entries for _S, including the character 160 which 
stands for "non breaking space" character.
Anyway since it seems an unlikely case I removed it as requested.

> 
>> +
>> +
>> +static bool virKernelArgsAreEqual(const char *arg1,
>> +                                  const char *arg2,
>> +                                  size_t n)
>> +{
>> +    size_t i;
>> +
>> +    for (i = 0; i < n; i++) {
>> +        if ((arg1[i] == '-' && arg2[i] == '_') ||
>> +            (arg1[i] == '_' && arg2[i] == '-') || (arg1[i] == arg2[i])) {
> 
> Because '-' and '_' are equal in the parameter syntax, rather than introducing
> another string equality function just because of this, we should normalize the
> inputs by replacing occurrences of one with the other and then simply do STREQ
> at the appropriate spot in the code

While reworking the parsing it has been changed.

> 
> 
>> +            if (arg1[i] == '\0')
>> +                return true;
>> +            continue;
>> +        }
>> +        return false;
>> +    }
>> +    return true;
>> +}
>> +
>> +
>> +/*
>> + * Parse the kernel cmdline and store the value of @arg in @val
>> + * which can be NULL if @arg has no value or if it's not found.
>> + * In addition, store in @next the address right after
>> + * @arg=@value for possible further processing.
>> + *
>> + * @arg: kernel command line argument
>> + * @cmdline: kernel command line string to be checked for @arg
>> + * @val: pointer to hold retrieved value of @arg
>> + * @next: pointer to hold address right after @arg=@val, will
>> + * point to the string's end (NULL) in case @arg is not found
>> + *
>> + * Returns 0 if @arg is present in @cmdline, 1 otherwise
>> + */
>> +int virKernelCmdlineGetValue(const char *arg,
>> +                             const char *cmdline,
>> +                             char **val,
>> +                             const char **next)
>> +{
>> +    size_t i = 0, param_i;
> 
> in this specific case I think that naming the iteration variable param_i is
> more confusing than actually settling down with something like "offset"
> especially when you're doing arithmetics with it.

Changed it.

> 
>> +    size_t arg_len = strlen(arg);
>> +    bool is_quoted, match;
> 
> 1 declaration/definition per line please

Changed it.

> 
>> +
>> +    *next = cmdline;
>> +    *val = NULL;
>> +
>> +    while (cmdline[i] != '\0') {
>> +        /* remove leading spaces */
>> +        while (VIR_IS_CMDLINE_SPACE(cmdline[i]))
>> +            i++;
> 
> For ^this, we already have a primitive virSkipSpaces()

Changed it.

> 
>> +        if (cmdline[i] == '"') {
>> +            i++;
>> +            is_quoted = true;
>> +        } else {
>> +            is_quoted = false;
>> +        }
>> +        for (param_i = i; cmdline[param_i]; param_i++) {
>> +            if ((VIR_IS_CMDLINE_SPACE(cmdline[param_i]) && !is_quoted) ||
>> +                cmdline[param_i] == '=')
>> +                break;
>> +            if (cmdline[param_i] == '"')
>> +                is_quoted = !is_quoted;
>> +        }
>> +        if (param_i-i == arg_len && virKernelArgsAreEqual(cmdline+i, arg, arg_len))
> 
> We don't use Yoda conditions, so ^this should be arg_len == param_i - i

Corrected

> 
>> +            match = true;
>> +        else
>> +            match = false;
>> +        /* arg has no value */
>> +        if (cmdline[param_i] != '=') {
>> +            if (match) {
>> +                *next = cmdline+param_i;
> 
> please use a space in between the operands and the operator

Corrected

> 
>> +                return 0;
>> +            }
>> +            i = param_i;
>> +            continue;
>> +        }
>> +        param_i++;
>> +        if (cmdline[param_i] == '\0')
>> +            break;
>> +
>> +        if (cmdline[param_i] == '"') {
>> +            param_i++;
>> +            is_quoted = !is_quoted;
>> +        }
>> +        i = param_i;
>> +
>> +        for (; cmdline[param_i]; param_i++) {
>> +            if (VIR_IS_CMDLINE_SPACE(cmdline[param_i]) && !is_quoted)
>> +                break;
>> +            if (cmdline[param_i] == '"')
>> +                is_quoted = !is_quoted;
>> +        }
>> +        if (match) {
>> +            *next = cmdline+param_i;
>> +            if (cmdline[param_i-1] == '"')
>> +                *val = g_strndup(cmdline+i, param_i-i-1);
>> +            else
>> +                *val = g_strndup(cmdline+i, param_i-i);
>> +            return 0;
>> +        }
>> +        i = param_i;
>> +    }
>> +    *next = cmdline+i;
>> +    return 1;
> 
> Anyhow, this function is IMO doing too much on its own - parsing tokens,
> matching the arguments, and extracting parameters and their values. All of that
> should be split into helpers to enhance readability. Unfortunately you can't
> use our existing tokenizer virStringSplit because of values containing spaces,
> so you'll have to write your own that takes that into account, it should return
> a token (parameter or parameter=value), then parse the returned token into
> key-value pair, match the key with the input key and return whatever is needed
> to be returned.

v2 will have a reworked approach which is more like you outlined above.

> 
>> +}
>> +
>> +
>> +#define VIR_CMDLINE_STR_CMP(kernel_val, caller_val, flags) \
>> +    (((flags & VIR_KERNEL_CMDLINE_FLAGS_CMP_EQ) && \
>> +      STREQ(kernel_val, caller_val)) || ((flags & VIR_KERNEL_CMDLINE_FLAGS_CMP_PREFIX) && \
>> +                                         STRPREFIX(kernel_val, caller_val)))
>> +
>> +
>> +/*
>> + * Try to match the provided kernel cmdline string with the provided @arg
>> + * and the list @values of possible values according to the matching strategy
>> + * defined in @flags. Possible options include:
>> + * - VIR_KERNEL_CMDLINE_FLAGS_CMP_PREFIX: do a substring comparison of values
>> + *   (uses size of value provided as input)
>> + * - VIR_KERNEL_CMDLINE_FLAGS_CMP_EQ: do a strict string comparison
>> + * - VIR_KERNEL_CMDLINE_FLAGS_SEARCH_STICKY: first positive match satifies search
> 
> Why are ^these constants needed? Especially the last two. Kernel is parsing the
> parameters sequentially, so the last one wins. We should match that behaviour
> when determining the value of a setting and ignore any previous occurrences of
> a parameter.
> 
> Regards,
> Erik
> 


-- 
Mit freundlichen Grüßen/Kind regards
    Boris Fiuczynski

IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Gregor Pillen
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294





More information about the libvir-list mailing list