[libvirt] [PATCH] conf/qemu: enforce NUMA nodes only for x86 memory hotplug

Tue Sep 15 08:00:42 UTC 2015

Peter Krempa <pkrempa at redhat.com> writes:

> On Tue, Aug 18, 2015 at 15:35:11 +0530, Nikunj A Dadhania wrote:
>> libvirt enforces at least one NUMA node for memory hotplug support on
>> all architectures. While it might be required for some x86 guest,
>> PowerPC can hotplug memory on non-NUMA system.
>> 
>> The generic checks are replaced with arch specific check and xml
>> validation too does not enforce "node" for non-x86 arch.
>> 
>> CC: Peter Krempa <pkrempa at redhat.com>
>> Signed-off-by: Nikunj A Dadhania <nikunj at linux.vnet.ibm.com>
>> ---
>>  src/conf/domain_conf.c  |  9 ++++++---
>>  src/qemu/qemu_command.c | 28 +++++++++++++++++-----------
>>  2 files changed, 23 insertions(+), 14 deletions(-)
>> 
>> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
>> index fd0450f..4cb2d4a 100644
>> --- a/src/conf/domain_conf.c
>> +++ b/src/conf/domain_conf.c
>
> ...
>
>> @@ -12437,7 +12438,7 @@ virDomainMemoryTargetDefParseXML(xmlNodePtr node,
>>      xmlNodePtr save = ctxt->node;
>>      ctxt->node = node;
>>  
>> -    if (virXPathUInt("string(./node)", ctxt, &def->targetNode) < 0) {
>> +    if (virXPathUInt("string(./node)", ctxt, &def->targetNode) < 0 && ARCH_IS_X86(domDef->os.arch)) {
>
> The parser code should not be made architecture dependant. In this case
> we will need to adjust the code in a way that it will set a known value
> in case the numa node was not provided in the device XML and the check
> itself will need to be moved into the post parse callback so that the
> decision can be made on a per-hypervisor basis.

Sure, the only requirement is node should not be made mandatory in
parser code.

>
>>          virReportError(VIR_ERR_XML_ERROR, "%s",
>>                         _("invalid or missing value of memory device node"));
>>          goto cleanup;
>
> ...
>
>> diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
>> index ae03618..51160e7 100644
>> --- a/src/qemu/qemu_command.c
>> +++ b/src/qemu/qemu_command.c
>> @@ -4979,8 +4979,12 @@ qemuBuildMemoryBackendStr(unsigned long long size,
>>      *backendProps = NULL;
>>      *backendType = NULL;
>>  
>> -    /* memory devices could provide a invalid guest node */
>> -    if (guestNode >= virDomainNumaGetNodeCount(def->numa)) {
>> +    /* memory devices could provide a invalid guest node. Moreover,
>> +     * x86 guests needs at least one numa node to support memory
>> +     * hotplug
>> +     */
>> +    if ((virDomainNumaGetNodeCount(def->numa) == 0 && ARCH_IS_X86(def->os.arch)) ||
>> +        guestNode > virDomainNumaGetNodeCount(def->numa)) {
>
> If we make this ARCH dependent here it will be hard to adjust it again
> in the future. Also I think we should whitelist PPC rather than
> blacklisting x86, since other ARCHes and OSes might have the same
> problem here.

Sure.

>
>>          virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
>>                         _("can't add memory backend for guest node '%d' as "
>>                           "the guest has only '%zu' NUMA nodes configured"),
>> @@ -4991,10 +4995,12 @@ qemuBuildMemoryBackendStr(unsigned long long size,
>>      if (!(props = virJSONValueNewObject()))
>>          return -1;
>>  
>> -    memAccess = virDomainNumaGetNodeMemoryAccessMode(def->numa, guestNode);
>> -    if (virDomainNumatuneGetMode(def->numa, guestNode, &mode) < 0 &&
>> -        virDomainNumatuneGetMode(def->numa, -1, &mode) < 0)
>> -        mode = VIR_DOMAIN_NUMATUNE_MEM_STRICT;
>> +    if (virDomainNumaGetNodeCount(def->numa)) {
>> +        memAccess = virDomainNumaGetNodeMemoryAccessMode(def->numa, guestNode);
>> +        if (virDomainNumatuneGetMode(def->numa, guestNode, &mode) < 0 &&
>> +            virDomainNumatuneGetMode(def->numa, -1, &mode) < 0)
>> +            mode = VIR_DOMAIN_NUMATUNE_MEM_STRICT;
>> +    }
>>  
>>      if (pagesize == 0) {
>>          /* Find the huge page size we want to use */
>> @@ -9238,11 +9244,11 @@ qemuBuildCommandLine(virConnectPtr conn,
>>              goto error;
>>          }
>>  
>> -        /* due to guest support, qemu would silently enable NUMA with one node
>> -         * once the memory hotplug backend is enabled. To avoid possible
>> -         * confusion we will enforce user originated numa configuration along
>> -         * with memory hotplug. */
>> -        if (virDomainNumaGetNodeCount(def->numa) == 0) {
>> +        /* x86 windows guest needs at least one numa node to be
>> +         * present. While its not possible to detect what guest os is
>> +         * running, enforce this limitation only to x86 architecture.
>
> Actually, qemu would add the numa node anyways, so the libvirt XML would
> not correspond to the configuration the guest sees and to avoid that we
> enforce the numa node.

I did not see this in my testing for PPC64.

>
>> +         */
>> +        if (ARCH_IS_X86(def->os.arch) && virDomainNumaGetNodeCount(def->numa) == 0) {
>>              virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
>>                             _("At least one numa node has to be configured when "
>>                               "enabling memory hotplug"));
>
> Additionally, there's a bug in libvirt, where we'd use incorrect memory
> sizes when hotplug would be enabled without a numa node. I have a patch
> for this issue. Since this patch needs to be almost completely reworked
> I'll propose a patch that will lift this limitation without introducing
> arch specific code in multiple places.

That will be great, thanks

Regards
Nikunj