[vfio-users] VFIO and KSM (and maybe hugepages)

Colin Godsey crgodsey at gmail.com
Mon Jun 6 17:02:45 UTC 2016


Also, is there any alternative to /proc/pid/pagemap? I can’t read the
contents for the qemu process, guessing because of the get_user_pages
pins.
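
(For reference, this is roughly how I’ve been reading it: a minimal
sketch based on the 64-bits-per-page format in
Documentation/vm/pagemap.txt; the pid and vaddr arguments are just for
the example. One thing worth noting, in case it explains the empty
reads: since around v4.2 the PFN field reads back as zero without
CAP_SYS_ADMIN.)

    #include <stdio.h>
    #include <stdint.h>
    #include <stdlib.h>
    #include <fcntl.h>
    #include <unistd.h>

    /* Look up the pagemap entry for one virtual address of a process.
     * Bit 63 = page present, bits 0-54 = PFN (zeroed for non-root). */
    int main(int argc, char **argv)
    {
        if (argc != 3) {
            fprintf(stderr, "usage: %s <pid> <vaddr-hex>\n", argv[0]);
            return 1;
        }
        unsigned long vaddr = strtoul(argv[2], NULL, 16);
        uint64_t entry;
        char path[64];

        snprintf(path, sizeof(path), "/proc/%s/pagemap", argv[1]);
        int fd = open(path, O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        /* one 64-bit entry per virtual page */
        off_t off = (off_t)(vaddr / sysconf(_SC_PAGESIZE)) * sizeof(entry);
        if (pread(fd, &entry, sizeof(entry), off) != (ssize_t)sizeof(entry)) {
            perror("pread"); /* fails for addresses outside any VMA */
            return 1;
        }
        printf("present=%d pfn=0x%llx\n", (int)(entry >> 63),
               (unsigned long long)(entry & ((1ULL << 55) - 1)));
        close(fd);
        return 0;
    }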

On Mon, Jun 6, 2016 at 11:53 AM Colin Godsey <crgodsey at gmail.com> wrote:

> I was able to reproduce the issue again by enabling KSM and THP, with
> aggressive settings for KSM. This time it appeared on one VM when the
> other hibernated itself (it has also happened with both running
> previously). The issue first manifested as two driver crashes (very
> uncommon for me, but ignored for this test), and the final symptom was
> the triangle tearing I was seeing before. It hit at about the 1hr
> mark, which is rather fast. There were ~300 shared pages at the time,
> so I’m assuming it happened upon unsharing the pages.
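>
> (By “aggressive” I just mean cranking the stock sysfs knobs, something
> like this; the exact values here are only illustrative, nothing
> canonical:)
>
>     #include <stdio.h>
>
>     /* write one value into a KSM sysfs knob */
>     static void set_knob(const char *path, const char *val)
>     {
>         FILE *f = fopen(path, "w");
>         if (!f) { perror(path); return; }
>         fputs(val, f);
>         fclose(f);
>     }
>
>     int main(void)
>     {
>         set_knob("/sys/kernel/mm/ksm/pages_to_scan", "1000"); /* big scan batches */
>         set_knob("/sys/kernel/mm/ksm/sleep_millisecs", "20"); /* short sleeps between passes */
>         set_knob("/sys/kernel/mm/ksm/run", "1");              /* start merging */
>         return 0;
>     }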
>
> I’ve confirmed that THP + no KSM works fine, as it should in v4.4. The
> last test I need to do is KSM with no THP. Considering these are both
> remapping modules using slightly different methods, I wouldn’t be
> surprised if there were some sort of ref-counting issue when they are
> used together (combined with the long-held get_user_pages pin for
> VFIO). KSM’s and THP’s ref counting in 4.4 is really hard to follow,
> and both were entirely redone in 4.5.
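>
> (For reference, the long-held pin I mean is the one the type1 backend
> takes at DMA map time; simplified from v4.4
> drivers/vfio/vfio_iommu_type1.c, a single gup reference per page that
> is only dropped at unmap:)
>
>     static int vaddr_get_pfn(unsigned long vaddr, int prot, unsigned long *pfn)
>     {
>         struct page *page[1];
>
>         /* elevates the page refcount; held until the IOMMU unmap */
>         if (get_user_pages_fast(vaddr, 1, !!(prot & IOMMU_WRITE), page) == 1) {
>             *pfn = page_to_pfn(page[0]);
>             return 0;
>         }
>
>         /* (the real code falls back to walking PFNMAP vmas here) */
>         return -EFAULT;
>     }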
>
> Looking over the KSM, THP, VFIO and various other code… it does appear
> that get_user_pages for long-held pins is rather brittle, and neither
> the THP nor the KSM code appears to strengthen any of its guarantees.
>
> KSM does appear to have some really messy count-checking logic around
> its write_protect_page() function, including special post-race checks
> looking for O_DIRECT.
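>
> (The check I mean, roughly as it reads in v4.4 mm/ksm.c
> write_protect_page(); an extra reference such as a gup pin or in-flight
> O_DIRECT makes page_count exceed mapcount + 1 + swapped, so KSM
> restores the pte and backs off:)
>
>     entry = ptep_clear_flush_notify(vma, addr, ptep);
>     /*
>      * Check that no O_DIRECT or similar I/O is in progress on the
>      * page
>      */
>     if (page_mapcount(page) + 1 + swapped != page_count(page)) {
>         set_pte_at(mm, addr, ptep, entry);
>         goto out_unlock;
>     }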
>
> I… think I found all the /proc calls I need to catch any mapping that
> might be occurring (smaps, maps). Is there a better way, or any more
> detailed proc calls? Also, all IOMMU DMA regions should show in the
> maps under vfio, correct (for the DMAR)?
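>
> (The closest thing I’ve found so far is cross-referencing pagemap PFNs
> against /proc/kpageflags, which has per-page KSM/THP bits; the bit
> numbers are from include/uapi/linux/kernel-page-flags.h, and it needs
> root:)
>
>     #include <stdio.h>
>     #include <stdint.h>
>     #include <stdlib.h>
>     #include <fcntl.h>
>     #include <unistd.h>
>
>     /* given a PFN (e.g. from pagemap), report its KSM/THP flags */
>     int main(int argc, char **argv)
>     {
>         if (argc != 2) {
>             fprintf(stderr, "usage: %s <pfn-hex>\n", argv[0]);
>             return 1;
>         }
>         uint64_t pfn = strtoull(argv[1], NULL, 16), flags;
>         int fd = open("/proc/kpageflags", O_RDONLY);
>         if (fd < 0) { perror("open"); return 1; }
>         if (pread(fd, &flags, sizeof(flags), (off_t)(pfn * sizeof(flags)))
>                 != (ssize_t)sizeof(flags)) {
>             perror("pread");
>             return 1;
>         }
>         printf("ksm=%d thp=%d\n",
>                (int)((flags >> 21) & 1),   /* KPF_KSM = 21 */
>                (int)((flags >> 22) & 1));  /* KPF_THP = 22 */
>         close(fd);
>         return 0;
>     }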
>
> System background-
>
> Ubuntu ‘lowlatency’ kernel 4.4.0-22-lowlatency (voluntary kernel
> preempt, irq threads, etc). Confirmed the artifacts happen on the
> generic kernel also.
> Skylake with updated microcode (hopefully fixing the p-state and hang
> issues).
> Single die, single NUMA node.
>
> On Tue, May 31, 2016 at 4:54 PM Colin Godsey <crgodsey at gmail.com> wrote:
>
>> Hmm, it would have been a v4.4 build. Could very well just be a
>> coincidence that I haven’t seen it again.
>>
>> Either way, I appreciate the info! I’ve been wary of trying THP/KSM
>> for a while because of that, but this renews my faith. The
>> clarification about THP is also a relief. There are still so many
>> articles/posts that list these ‘gotchas’ regarding VFIO (which have
>> mostly been fixed), so it is pretty easy to go on unneeded witch
>> hunts =\
>>
>> Regarding KSM though… as far as I can tell (in the 4.4 kernel), KSM
>> doesn’t use the normal GUP-style references (it uses
>> get_user_pages_fast internally): https://github.com/torvalds/linux/blob/v4.4/mm/ksm.c#L887
>>
>> I’m not terribly familiar with KSM itself, but from what I’ve
>> gathered of its history, this may be a mechanism it uses to allow
>> shared pages to still go to swap, etc.
>>
>> Also, AFAIK KSM, THP and kswapd (at v4.4) all manipulate the mm
>> differently, with KSM and THP having their own unique problems
>> dealing with each other. This could also have been some kind of
>> perfect storm from using basically all of the memory-mapping
>> technologies modern Linux has at the same time…
>>
>> I’ll need to step through the entire pre- and post-4.5 TLB changes
>> and see if anything looks familiar relative to the current KSM
>> mapping/ref-tracking.
>>
>> Unfortunately I don’t have any other info at the moment. I may try to
>> run some more isolated tests, but it was an intermittent issue that
>> required a few hours of gaming to flush out… so that might take a
>> bit =\
>>
>> On Tue, May 31, 2016 at 3:34 PM Alex Williamson <alex.williamson at redhat.com> wrote:
>>
>>> On Tue, 31 May 2016 20:20:58 +0000
>>> Colin Godsey <crgodsey at gmail.com> wrote:
>>>
>>> > I had a few questions regarding general ‘page management’ and
>>> > VFIO, mostly related to kernel shared pages.
>>> >
>>> > I have a host running 2 virtual ‘gaming rigs’, each with a single
>>> > dedicated GPU. I had an intermittent problem where, when gaming
>>> > (on the same game) with both rigs, one would show graphical
>>> > artifacts. Specifically I would see triangle/geometry artifacts,
>>> > which usually indicate corrupt GPU RAM.
>>> >
>>> > The two cards are completely different, from different generations
>>> > (one is really new), so I didn’t believe it was bad VRAM. Graphics
>>> > drivers do swap various buffers between system RAM and VRAM, so I
>>> > figured it could also be something related to system RAM.
>>> >
>>> > I disabled any kind of… alternative page management I could (swap,
>>> > KSM, huge pages, etc.) and that did fix it. Because the issue
>>> > would only affect one machine, and I only observed it when the
>>> > same game was running on both, I assumed maybe it was related to
>>> > KSM.
>>> >
>>> > *Is there any possible way KSM could interfere with the DMAR, in
>>> > some way where it tries to share/alter DMA regions?* And more
>>> > broadly: what prevents systems like khugepaged, kswapd, and KSM
>>> > from interfering with these regions in the first place? I’ve read
>>> > that transparent hugepages can interfere with VFIO; is it safe to
>>> > assume that other DMA issues could arise with other types of page
>>> > management?
>>>
>>> What kernel were you running where you saw this?  vfio uses
>>> get_user_pages to increase the reference count on pages mapped through
>>> the iommu.  This should prevent both ksm and transparent hugepages from
>>> being able to operate on the pages.  Kernel v4.5 had a bug (now fixed
>>> in v4.5.5) that did not honor the reference, allowing thp (maybe ksm
>>> too) to still operate on those pages.  So as long as you're not running
>>> v4.5.0 through v4.5.4 (or a v4.5-rc), I'm not aware of any issues with
>>> page pinning.  Thanks,
>>>
>>> Alex
>>>
>>