[libvirt] [PATCH v4 1/2] qemu_domain: NVLink2 bridge detection function for PPC64

Daniel Henrique Barboza danielhb413 at gmail.com
Tue Apr 2 20:27:55 UTC 2019



On 4/2/19 5:34 AM, Erik Skultety wrote:
> On Tue, Mar 12, 2019 at 06:55:49PM -0300, Daniel Henrique Barboza wrote:
>> The NVLink2 support in QEMU implements the detection of NVLink2
>> capable devices by verifying the attributes of the VFIO mem region
>> QEMU allocates for the NVIDIA GPUs. To properly allocate an
>> adequate amount of memLock, Libvirt needs this information before
>> a QEMU instance is even created, thus querying QEMU is not
>> possible and opening a VFIO window is too much.
>>
>> An alternative is presented in this patch. Making the following
>> assumptions:
>>
>> - if we want GPU RAM to be available in the guest, an NVLink2 bridge
>> must be passed through;
>>
>> - an unknown PCI device can be classified as a NVLink2 bridge
>> if its device tree node has 'ibm,gpu', 'ibm,nvlink',
>> 'ibm,nvlink-speed' and 'memory-region'.
> Alexey mentioned that it should be enough to check for the properties ^above.
> I'm just wondering, knowing this is IBM's PPC8/9 if the assumptions we have
> made are going to stay with further revisions of PPC, NVLink and GPUs, IOW
> we need to be sure "ibm,nvlink" won't be renamed with further revisions, e.g.
> other cards than V100 in the future, because then compatibility and revision
> selection comes into the picture.

Perhaps Alexey or Piotr can comment on this. I can't confirm that the
device node naming will remain as is in the long run.


>> This patch introduces a helper called @ppc64VFIODeviceIsNV2Bridge
>> that checks the device tree node of a given PCI device and
>> check if it meets the criteria to be a NVLink2 bridge. This
> Just out of curiosity, what about NVLink 1.0? Apart from performance, I wasn't
> able to find something useful in terms of compatibility, is there something to
> consider, since we're only relying on NVLink 2.0?

NVLink1, as far as Libvirt goes, works similar to a regular GPU passthrough.
There is no changes in the memory topology in QEMU that needs extra
code to adjust rlimit.

I am not entirely sure if the code above will not generate a false-positive,
mistakenly detecting NV2 scenario for a NV1 GPU. I think the 'memory-region'
attribute won't be present in a NV1 bridge. Piotr, can you comment here?



>
>> new function will be used in a follow-up patch that, using the
>> first assumption, will set up the rlimits of the guest
>> accordingly.
>>
>> Signed-off-by: Daniel Henrique Barboza <danielhb413 at gmail.com>
>> ---
>>   src/qemu/qemu_domain.c | 29 +++++++++++++++++++++++++++++
>>   1 file changed, 29 insertions(+)
>>
>> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
>> index 1659e88478..dcc92d253c 100644
>> --- a/src/qemu/qemu_domain.c
>> +++ b/src/qemu/qemu_domain.c
>> @@ -10398,6 +10398,35 @@ qemuDomainUpdateCurrentMemorySize(virDomainObjPtr vm)
>>   }
>>
>>
>> +/**
>> + * ppc64VFIODeviceIsNV2Bridge:
>> + * @device: string with the PCI device address
>> + *
>> + * This function receives a string that represents a PCI device,
>> + * such as '0004:04:00.0', and tells if the device is a NVLink2
>> + * bridge.
>> + */
>> +static bool
>> +ppc64VFIODeviceIsNV2Bridge(const char *device)
>> +{
>> +    const char *nvlink2Files[] = {"ibm,gpu", "ibm,nvlink",
>> +                                  "ibm,nvlink-speed", "memory-region"};
> Like I said above, if ^this is not to change, then I'm okay in principle, still
> I feel like this needs David's ACK (putting him on CC)
>
> Codewise, Peter already provided you with comments.
>
> Erik
>
>
>> +    char *file;
>> +    size_t i;
>> +
>> +    for (i = 0; i < 4; i++) {
>> +        if ((virAsprintf(&file, "/sys/bus/pci/devices/%s/of_node/%s",
>> +                         device, nvlink2Files[i])) < 0)
>> +            return false;
>> +
>> +        if (!virFileExists(file))
>> +            return false;
>> +    }
>> +
>> +    return true;
>> +}
>> +
>> +
>>   /**
>>    * getPPC64MemLockLimitBytes:
>>    * @def: domain definition
>> --
>> 2.20.1
>>
>> --
>> libvir-list mailing list
>> libvir-list at redhat.com
>> https://www.redhat.com/mailman/listinfo/libvir-list




More information about the libvir-list mailing list