[libvirt] [PATCH 3/5] qemu: prefer memfd for anonymous memory
Marc-André Lureau
marcandre.lureau at redhat.com
Tue Sep 11 07:53:57 UTC 2018
Hi
On Tue, Sep 11, 2018 at 2:46 AM, John Ferlan <jferlan at redhat.com> wrote:
>
> On 09/07/2018 07:32 AM, marcandre.lureau at redhat.com wrote:
>> From: Marc-André Lureau <marcandre.lureau at redhat.com>
>>
>
> Would be nice to have a few more words here. If you provide them I can
> add them... The if statement is difficult to read unless you know what
> each field really means.
hostmem-memfd is quite similar to hostmem-file. The main benefits are
that it doesn't need to create filesystem files, and it also enforces
sealing, providing a bit more safety.
> secondary question - should we document what gets used?, e.g.:
>
> https://libvirt.org/formatdomain.html#elementsMemoryBacking
>
> Seems to me the preference to use memfd is for memory backing using
> anonymous source for nvdimm's without a defined path, but sometimes my
> wording doesn't match reality.
Yes it could be documented. But it's now an allocation decision that
could evolve, or an implementation detail.
Would you like to see something like that?
<dt><code>source</code></dt>
- <dd>In this attribute you can switch to file memorybacking or keep
- default anonymous.</dd>
+ <dd>In this attribute you can switch to file memorybacking or
+ keep default anonymous. <span class="since">Since 4.8.0</span>,
+ when the memory is anonymous and the host supports it, libvirt
+ will use a memfd memory backing, providing additional safety
+ guarantees.
+ </dd>
<dt><code>access</code></dt>
>
>> Signed-off-by: Marc-André Lureau <marcandre.lureau at redhat.com>
>> ---
>> src/qemu/qemu_command.c | 61 +++++++++++++++++++++++++++++------------
>> 1 file changed, 43 insertions(+), 18 deletions(-)
>>
>> diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
>> index c9e3a91e32..97cfc8a18d 100644
>> --- a/src/qemu/qemu_command.c
>> +++ b/src/qemu/qemu_command.c
>> @@ -3113,6 +3113,24 @@ qemuBuildControllerDevCommandLine(virCommandPtr cmd,
>> return ret;
>> }
>>
>> +static int
>> +qemuBuildMemoryBackendPropsShare(virJSONValuePtr props,
>> + virDomainMemoryAccess memAccess)
>> +{
>> + switch (memAccess) {
>> + case VIR_DOMAIN_MEMORY_ACCESS_SHARED:
>> + return virJSONValueObjectAdd(props, "b:share", true, NULL);
>> +
>> + case VIR_DOMAIN_MEMORY_ACCESS_PRIVATE:
>> + return virJSONValueObjectAdd(props, "b:share", false, NULL);
>> +
>> + case VIR_DOMAIN_MEMORY_ACCESS_DEFAULT:
>> + case VIR_DOMAIN_MEMORY_ACCESS_LAST:
>> + break;
>> + }
>> +
>> + return 0;
>> +}
>>
>> /**
>> * qemuBuildMemoryBackendProps:
>
> The comments should have been updated... In particular:
>
> "Then, if one of the two memory-backend-* should be used..."
>
>> @@ -3259,7 +3277,23 @@ qemuBuildMemoryBackendProps(virJSONValuePtr *backendProps,
>> if (!(props = virJSONValueNewObject()))
>> return -1;
>>
>> - if (useHugepage || mem->nvdimmPath || memAccess ||
>
> Is this preference over "-ram" or "-file"? It would seem to me someone
> choosing "file" has a specific case and this is more for those other
> options where if capabilities exist, then we try to use them.
(tbh, I don't know if you could have both nvdimmPath source ==
ANONYMOUS. That seems incompatible to me)
So the 'if' statement reads: if memory source is anonymous, and qemu
supports memfd (and if hugepage is requested and memfd supports it),
then let's use memfd, otherwise, keep/use the existing allocation
rules.
>> + if (!mem->nvdimmPath &&
>> + def->mem.source == VIR_DOMAIN_MEMORY_SOURCE_ANONYMOUS &&
>> + virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_MEMORY_MEMFD) &&
>> + (!useHugepage || virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_MEMORY_MEMFD_HUGETLB))) {
>> + backendType = "memory-backend-memfd";
>> +
>> + if (useHugepage &&
>> + (virJSONValueObjectAdd(props, "b:hugetlb", useHugepage, NULL) < 0 ||
>> + virJSONValueObjectAdd(props, "U:hugetlbsize", pagesize << 10, NULL) < 0)) {
>> + goto cleanup;
>> + }
>> +
>> + if (qemuBuildMemoryBackendPropsShare(props, memAccess) < 0) {
>> + goto cleanup;
>> + }
>
> Running syntax-check would fail because of the { }
Sorry, I keep mixing with the QEMU coding style...
>
>> +
>> + } else if (useHugepage || mem->nvdimmPath || memAccess ||
>> def->mem.source == VIR_DOMAIN_MEMORY_SOURCE_FILE) {
>>
>> if (mem->nvdimmPath) {
>> @@ -3297,20 +3331,8 @@ qemuBuildMemoryBackendProps(virJSONValuePtr *backendProps,
>> goto cleanup;
>> }
>>
>> - switch (memAccess) {
>> - case VIR_DOMAIN_MEMORY_ACCESS_SHARED:
>> - if (virJSONValueObjectAdd(props, "b:share", true, NULL) < 0)
>> - goto cleanup;
>> - break;
>> -
>> - case VIR_DOMAIN_MEMORY_ACCESS_PRIVATE:
>> - if (virJSONValueObjectAdd(props, "b:share", false, NULL) < 0)
>> - goto cleanup;
>> - break;
>> -
>> - case VIR_DOMAIN_MEMORY_ACCESS_DEFAULT:
>> - case VIR_DOMAIN_MEMORY_ACCESS_LAST:
>> - break;
>> + if (qemuBuildMemoryBackendPropsShare(props, memAccess) < 0) {
>> + goto cleanup;
>> }
>
> Running syntax-check would fail here because of the { }
>
> All this is fix-able without you needing to post another series, but I
> need you to provide the verbiage for the intro and perhaps something
> that could be added to the web page. I can adjust the patch accordingly.
>
> Assuming of course Michal doesn't have other reservations...
>
> Reviewed-by: John Ferlan <jferlan at redhat.com>
If you already resolved patch 1 & 2 conflicts, I would appreciate if
you could take care. Otherwise I'll have to rebase & resend the
patches.
thanks
>
> John
>
>> } else {
>> backendType = "memory-backend-ram";
>> @@ -7609,7 +7631,8 @@ qemuBuildNumaArgStr(virQEMUDriverConfigPtr cfg,
>>
>> if (virDomainNumatuneHasPerNodeBinding(def->numa) &&
>> !(virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_MEMORY_RAM) ||
>> - virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_MEMORY_FILE))) {
>> + virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_MEMORY_FILE) ||
>> + virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_MEMORY_MEMFD))) {
>> virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
>> _("Per-node memory binding is not supported "
>> "with this QEMU"));
>> @@ -7618,7 +7641,8 @@ qemuBuildNumaArgStr(virQEMUDriverConfigPtr cfg,
>>
>> if (def->mem.nhugepages &&
>> def->mem.hugepages[0].size != system_page_size &&
>> - !virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_MEMORY_FILE)) {
>> + !(virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_MEMORY_FILE) ||
>> + virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_MEMORY_MEMFD_HUGETLB))) {
>> virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
>> _("huge pages per NUMA node are not "
>> "supported with this QEMU"));
>> @@ -7635,7 +7659,8 @@ qemuBuildNumaArgStr(virQEMUDriverConfigPtr cfg,
>> * need to check which approach to use */
>> for (i = 0; i < ncells; i++) {
>> if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_MEMORY_RAM) ||
>> - virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_MEMORY_FILE)) {
>> + virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_MEMORY_FILE) ||
>> + virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_MEMORY_MEMFD)) {
>>
>> if ((rc = qemuBuildMemoryCellBackendStr(def, cfg, i, priv,
>> &nodeBackends[i])) < 0)
>>
More information about the libvir-list
mailing list