[libvirt] [PATCH 05/14] qemu: Implement NVDIMM

Michal Privoznik mprivozn at redhat.com
Thu Feb 23 13:07:27 UTC 2017


On 02/23/2017 11:48 AM, Daniel P. Berrange wrote:
> On Thu, Feb 23, 2017 at 11:16:17AM +0100, Michal Privoznik wrote:
>> On 02/23/2017 11:02 AM, Daniel P. Berrange wrote:
>>> On Thu, Feb 23, 2017 at 10:02:48AM +0100, Michal Privoznik wrote:
>>>> So, majority of the code is just ready as-is. Well, with one
>>>> slight change: differentiate between dimm and nvdimm in places
>>>> like device alias generation, generating the command line and so
>>>> on.
>>>>
>>>> Speaking of the command line, we also need to append 'nvdimm=on'
>>>> to the '-machine' argument so that the nvdimm feature is
>>>> advertised in the ACPI tables properly.
>>>>
>>>> Signed-off-by: Michal Privoznik <mprivozn at redhat.com>
>>>
>>>
>>>> +        if (prealloc &&
>>>> +            virJSONValueObjectAdd(props,
>>>> +                                  "b:prealloc", true,
>>>> +                                  NULL) < 0)
>>>> +            goto cleanup;
>>>
>>> As discussed on IRC, using prealloc with QEMU causes it to memset()
>>> the first byte of every page of memory to 0. With NVDIMM this is
>>> obviously corrupting application data stored in the NVDIMM.
>>>
>>> Obviously this needs fixing in QEMU, but I would think that libvirt
>>> needs to block use of prealloc==true when running against existing
>>> broken QEMU versions, otherwise users are going to be rather upset
>>> to see their data corrupted on every boot
>>
>> They are, but should we work around broken QEMUs? And if so - how do we
>> detect whether the qemu we are dealing with is broken or already fixed
>> in that respect?
> 
> Most likely we'll just have todo a version number check I think, since
> fixing this isn't going to involve any externally visible change to
> QEMU.

What about backports then? If some distro decides to backport your fix
posted on the qemu list, this check here would prevent them from running
fixed qemu. Also, I don't think we want to work around qemu bugs.


> 
>> On the other hand, we can just not set prealloc true for nvdimm. That
>> will have one downside though - after qemu mmaps() the file, kernel is
>> not forced to create a private copy of the pages.
> 
> If the guest has requested prealloc, then I don't think we can ignore
> it for NVDIMM, because its a clear violation of the memory guarantee
> we're claiming to provide apps. Thus I think we've no option but to
> report an error if we see prealloc + broken QEMU

Do you mean user instead of guest? Because it's user who requests
prealloc (well, in theory it is user; libvirt does not allow users to
request prealloc but rather has some rules worked in that request it
instead).

Michal




More information about the libvir-list mailing list