[vfio-users] Brutal DPC Latency - how is yours? check it please and report back

Milos Kaurin milos.kaurin at gmail.com
Mon Jan 11 14:06:13 UTC 2016


Hello,

Yes, I have a Core i7.

I have to admit that Quentin's e-mail was the first I had heard of
DPC latency. I'm taking a strictly empirical approach for now,
but I'd like to dive deeper into this, at least to provide a reference
point for you guys.
The reason is that even though I'm familiar with Linux, I don't have
the low-level familiarity you guys have (other than conceptual).
I'm more than willing to learn given the opportunity, though.

Quentin:
From what I understand about your setup:
* You have an AMD CPU
* In your kernel parameters, you are trying to confine the
scheduling-clock interrupts to thread (core?) 0 only.
* Your script moves kernel memory management, future tasks and current
tasks onto thread 0
* The Valley benchmark seems to be the most sensitive to DPC latency
issues (as well as "Heroes of the Storm")
* Pinning only 3 cores to the VM gives you the best results, but seeing
that newer games take advantage of multiple cores, you'd like to have
an option to use more cores for winVirt
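
On the thread-vs-core question: as far as I know, the kernel exposes
which logical CPUs share a physical core under
/sys/devices/system/cpu/cpuN/topology/thread_siblings_list. A minimal
sketch for listing them (the function name and its optional argument are
my own, purely for illustration):

```shell
# list_thread_siblings [ROOT]: for each logical CPU directory under ROOT,
# print "cpuN: <logical CPUs sharing its physical core>".
# ROOT defaults to the standard sysfs location; the function name and the
# optional ROOT argument are hypothetical, not part of any existing tool.
list_thread_siblings() {
    root="${1:-/sys/devices/system/cpu}"
    for dir in "$root"/cpu[0-9]*; do
        file="$dir/topology/thread_siblings_list"
        # Skip entries without topology information (or an unmatched glob).
        if [ -r "$file" ]; then
            printf '%s: %s\n' "${dir##*/}" "$(cat "$file")"
        fi
    done
}

list_thread_siblings
```

On a layout like the one you described earlier (core 0 = threads 0 & 4),
I would expect cpu0 and cpu4 to both show "0,4".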

What I'd like from you:
* Can you provide me with your optimal (3 cores -> VM) settings,
including kernel parameters, your updated script and the XML of your
virt in this mode of use?
* Can you suggest a method for keeping track of DPC latency? I
found this: http://www.thesycon.de/deu/latency_check.shtml , but I'd
like us to use the same method.
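
To make comparing kernel parameters less error-prone, something like the
following could pull individual values out of /proc/cmdline. It is only a
sketch: kernel_param is a made-up helper name, and the parameter names
are the ones from your earlier mail.

```shell
# kernel_param NAME [FILE]: print the value of NAME=... from a kernel
# command line. FILE defaults to /proc/cmdline.
# Hypothetical helper; it splits on spaces, so it does not handle
# parameters whose values contain quoted spaces.
kernel_param() {
    tr ' ' '\n' < "${2:-/proc/cmdline}" | sed -n "s/^$1=//p"
}

kernel_param nohz_full
kernel_param rcu_nocbs
```

If a parameter is not set, nothing is printed for it.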

Why I'm asking all of this:
Just ran Valley (HD extreme). These are the results:

* Bare-metal:
FPS: 48.7
Score: 2036
Min FPS: 23.4
Max FPS: 90.6

* hugepages, nopin, 1x4x2 (sockets x cores x threads), host-passthrough:
FPS: 47.9
Score: 2005
Min FPS: 19.7
Max FPS: 91.5

The score is ~1.5% worse in the virt.
The min FPS difference (which looks significant) might not mean much,
because I was running Firefox on the host with a bunch of tabs open
(idle, though)
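
For reference, the percentages can be reproduced from the numbers above
with a small sketch like this (pct_drop is just a helper name I made up):

```shell
# pct_drop BARE VIRT: percentage by which VIRT is lower than BARE,
# printed with one decimal. Hypothetical helper for the numbers above.
pct_drop() {
    awk -v bare="$1" -v virt="$2" \
        'BEGIN { printf "%.1f\n", (bare - virt) / bare * 100 }'
}

pct_drop 2036 2005   # score
pct_drop 48.7 47.9   # average FPS
pct_drop 23.4 19.7   # min FPS
```

This gives 1.5 for the score, 1.6 for the average FPS and 15.8 for the
min FPS, so the min FPS is the only figure that differs noticeably.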

I have also been playing "Rocket League" in the virt, which is a very
twitchy game, and I play it at an experienced level. I did not find
any problems with playing the game like this.

My current XML: https://gist.github.com/Kaurin/0b6726e8a94084bd0b64
PCI devices passed through: nvidia+HDMI audio, onboard sound, onboard
XHCI USB controller

Notes about my setup:
* Both virt and host are hooked up to the same monitor (host - VGA / virt - DVI).
* I also don't have any additional USB controllers, which means that
when I turn on the virt, I lose USB (mouse, keyboard) on the host
* Same goes for sound: when I turn on the virt, I lose sound on the host
* I just flip the monitor input and I'm good to go.
* I have plans to set up new hardware so I can use both host and virt at
the same time

Let me know if further input from me would be useful.

Regards,
Milos



On Mon, Jan 11, 2016 at 9:19 AM, Quentin Deldycke
<quentindeldycke at gmail.com> wrote:
> In fact, some games react quite well to this latency. Fallout for example
> doesn't show much difference between host - vm with brutal DPC and vm with
> "good dpc".
>
> I tested 3 modes:
>
> - all 8 cores to the vm without pinning: brutal dpc, did not try to play games
> on it. Only Unigine Valley => 2600 points
> - 6 cores pinned to the vm + emulator on core 0,1: correct latency. Most
> games work flawlessly (bf4 / battlefront / diablo III) but some are
> catastrophic: Heroes of the Storm. valley => 2700
> - 3 cores pinned to vm: Perfect latency, all games work ok. But i am afraid
> 3 cores are a bit "not enough" for upcoming games. valley => 3100 points
>
> I think that valley is a good benchmark. It is free and small. It seems to
> be affected by this latency problem like most games.
>
>
>
>
> --
> Deldycke Quentin
>
>
> On 11 January 2016 at 09:59, rndbit <rndbit at sysret.net> wrote:
>>
>> Tried Milos' config too - DPC latency got worse. I use an AMD cpu though, so
>> it's hardly comparable.
>> One thing to note is that both VM and bare metal (same OS) score around 5k
>> points in the 3dmark fire strike test (VM 300 points less). Sounds not too bad,
>> but in reality bf4 is pretty much unplayable in the VM due to bad performance
>> and sound glitches, while playing it on bare metal is just fine. Again, DPC
>> latency on bare metal even under load is ok - an occasional spike here and
>> there, but mostly it's within the norm. Any kind of load on the VM makes DPC
>> go nuts and performance is terrible. I even tried isolcpus=4,5,6,7 and binding
>> the vm to those free cores - it's all the same.
>>
>> An interesting observation is that i used to play titanfall without a hitch
>> in a VM some time in the past, 3.10 kernel or so (no patches). When i get a free
>> moment i'll try downgrading the kernel, maybe the problem is there.
>>
>>
>> On 2016.01.11 10:39, Quentin Deldycke wrote:
>>
>> Also, i just saw something:
>>
>> You use ultra (4k?) settings on a GTX 770. This is too heavy for it. You
>> have less than 10fps. So in fact if you lose, let's say, 10% of performance,
>> you will barely see it.
>>
>> What we are looking for is a very fast response time. Could you please
>> compare your system with a less heavy benchmark? It is easier to see the
>> difference at ~50-70 fps.
>>
>> In my case, this configuration works. But my fps fluctuates quite a lot. If
>> you are a bit of a serious gamer, these drops are not an option during a game :)
>>
>> --
>> Deldycke Quentin
>>
>>
>> On 11 January 2016 at 08:54, Quentin Deldycke <quentindeldycke at gmail.com>
>> wrote:
>>>
>>> DPC latency is hugely buggy using this mode.
>>>
>>> My fps also swings in an apocalyptic way: from 80 to 45 fps without
>>> moving in Unigine Valley.
>>>
>>> Do you have anything running on your linux? (i have plasma doing nothing
>>> on another screen)
>>>
>>> Unigine Heaven went back to 2600 points from 3100
>>> Cinebench r15: single core 124
>>>
>>>
>>> Could you please send your whole xml file, qemu version and kernel config
>>> / boot?
>>>
>>> I will try to get 3dmark and verify host / virtual comparison
>>>
>>> --
>>> Deldycke Quentin
>>>
>>>
>>> On 9 January 2016 at 20:24, Milos Kaurin <milos.kaurin at gmail.com> wrote:
>>>>
>>>> My details:
>>>> Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
>>>> 32GB total ram
>>>> hugepages at 16x1GB for the guest (didn't have much effect on 3dmark
>>>> results)
>>>>
>>>> I have had the best performance with:
>>>>
>>>>   <vcpu placement='static'>8</vcpu>
>>>>   <cpu mode='custom' match='exact'>
>>>>     <model fallback='allow'>host-passthrough</model>
>>>>     <topology sockets='1' cores='4' threads='2'/>
>>>>   </cpu>
>>>>
>>>> No CPU pinning on either guest or host
>>>>
>>>> Benchmark example (Bare metal Win10 vs Fedora Guest Win10)
>>>> http://www.3dmark.com/compare/fs/7076732/fs/7076627#
>>>>
>>>>
>>>> Could you try my settings and report back?
>>>>
>>>> On Sat, Jan 9, 2016 at 3:14 PM, Quentin Deldycke
>>>> <quentindeldycke at gmail.com> wrote:
>>>> > I use virsh:
>>>> >
>>>> > ===SNIP===
>>>> >   <vcpu placement='static'>3</vcpu>
>>>> >   <cputune>
>>>> >     <vcpupin vcpu='0' cpuset='1'/>
>>>> >     <vcpupin vcpu='1' cpuset='2'/>
>>>> >     <vcpupin vcpu='2' cpuset='3'/>
>>>> >     <emulatorpin cpuset='6-7'/>
>>>> >   </cputune>
>>>> > ===SNAP===
>>>> >
>>>> > I have a prepare script running:
>>>> >
>>>> > ===SNIP===
>>>> > sudo mkdir /cpuset
>>>> > sudo mount -t cpuset none /cpuset/
>>>> > cd /cpuset
>>>> > echo 0 | sudo tee -a cpuset.cpu_exclusive
>>>> > echo 0 | sudo tee -a cpuset.mem_exclusive
>>>> >
>>>> > sudo mkdir sys
>>>> > echo 'Building shield for core system... threads 0 and 4, and we place all running tasks there'
>>>> > /bin/echo 0,4 | sudo tee -a sys/cpuset.cpus
>>>> > /bin/echo 0 | sudo tee -a sys/cpuset.mems
>>>> > /bin/echo 0 | sudo tee -a sys/cpuset.cpu_exclusive
>>>> > /bin/echo 0 | sudo tee -a sys/cpuset.mem_exclusive
>>>> > for T in `cat tasks`; do sudo bash -c "/bin/echo $T > sys/tasks" >/dev/null 2>&1 ; done
>>>> > cd -
>>>> > ===SNAP===
>>>> >
>>>> > Note that i use this command line for the kernel
>>>> > nohz_full=1,2,3,4,5,6,7 rcu_nocbs=1,2,3,4,5,6,7 default_hugepagesz=1G
>>>> > hugepagesz=1G hugepages=12
>>>> >
>>>> >
>>>> > --
>>>> > Deldycke Quentin
>>>> >
>>>> >
>>>> > On 9 January 2016 at 15:40, rndbit <rndbit at sysret.net> wrote:
>>>> >>
>>>> >> Mind posting actual commands how you achieved this?
>>>> >>
>>>> >> All im doing now is this:
>>>> >>
>>>> >> cset set -c 0-3 system
>>>> >> cset proc -m -f root -t system -k
>>>> >>
>>>> >>   <vcpu placement='static'>4</vcpu>
>>>> >>   <cputune>
>>>> >>     <vcpupin vcpu='0' cpuset='4'/>
>>>> >>     <vcpupin vcpu='1' cpuset='5'/>
>>>> >>     <vcpupin vcpu='2' cpuset='6'/>
>>>> >>     <vcpupin vcpu='3' cpuset='7'/>
>>>> >>     <emulatorpin cpuset='0-3'/>
>>>> >>   </cputune>
>>>> >>
>>>> >> Basically this puts most threads on cores 0-3, including the emulator
>>>> >> threads. Some threads can't be moved though, so they remain on cores
>>>> >> 4-7. The VM is given cores 4-7. It works better but there is still
>>>> >> much to be desired.
>>>> >>
>>>> >>
>>>> >>
>>>> >> On 2016.01.09 15:59, Quentin Deldycke wrote:
>>>> >>
>>>> >> Hello,
>>>> >>
>>>> >> Using cpuset, i was using the vm with:
>>>> >>
>>>> >> Core 0: threads 0 & 4: linux + emulator pin
>>>> >> Core 1,2,3: threads 1,2,3,5,6,7: windows
>>>> >>
>>>> >> I tested with:
>>>> >> Core 0: threads 0 & 4: linux
>>>> >> Core 1,2,3: threads 1,2,3: windows
>>>> >> Core 1,2,3: threads 5,6,7: emulator
>>>> >>
>>>> >> The difference between the two is huge (DPC latency is much more
>>>> >> stable):
>>>> >> Single-core performance went up by 50% (cinebench per-core score
>>>> >> from 100 to 150 points)
>>>> >> GPU performance went up by 20% (cinebench from 80fps to 100+)
>>>> >> Performance in "Heroes of the Storm" went from 20~30 fps to a stable
>>>> >> 60 (and often more than 100)
>>>> >>
>>>> >> (performance of Unigine Heaven went from 2700 points to 3100 points)
>>>> >>
>>>> >> The only sad thing is that i have 3 idle threads which are barely
>>>> >> used... Is there any way to give them back to windows?
>>>> >>
>>>> >> --
>>>> >> Deldycke Quentin
>>>> >>
>>>> >>
>>>> >> On 29 December 2015 at 17:38, Michael Bauer <michael at m-bauer.org>
>>>> >> wrote:
>>>> >>>
>>>> >>> I noticed that attaching a DVD-Drive from the host leads to HUGE
>>>> >>> delays.
>>>> >>> I had attached my /dev/sr0 to the guest and even without a DVD in
>>>> >>> the drive
>>>> >>> this was causing huge lag about once per second.
>>>> >>>
>>>> >>> Best regards
>>>> >>> Michael
>>>> >>>
>>>> >>>
>>>> >>> Am 28.12.2015 um 19:30 schrieb rndbit:
>>>> >>>
>>>> >>> 4000μs-16000μs here, it's terrible.
>>>> >>> Tried what's said on
>>>> >>> https://lime-technology.com/forum/index.php?topic=43126.15
>>>> >>> It's a bit better with this:
>>>> >>>
>>>> >>>   <vcpu placement='static'>4</vcpu>
>>>> >>>   <cputune>
>>>> >>>     <vcpupin vcpu='0' cpuset='4'/>
>>>> >>>     <vcpupin vcpu='1' cpuset='5'/>
>>>> >>>     <vcpupin vcpu='2' cpuset='6'/>
>>>> >>>     <vcpupin vcpu='3' cpuset='7'/>
>>>> >>>     <emulatorpin cpuset='0-3'/>
>>>> >>>   </cputune>
>>>> >>>
>>>> >>> I tried isolcpus but it did not yield visible benefits. ndis.sys is the
>>>> >>> biggest offender here but i don't really understand why. Removing the
>>>> >>> network interface from the VM makes usbport.sys take over as the biggest
>>>> >>> offender. All this happens with the performance governor on all cpu
>>>> >>> cores:
>>>> >>>
>>>> >>> echo performance | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor >/dev/null
>>>> >>>
>>>> >>> Cores remain clocked at 4000 MHz. I don't know what else i could try.
>>>> >>> Does anyone have any ideas..?
>>>> >>>
>>>> >>> On 2015.10.29 08:03, Eddie Yen wrote:
>>>> >>>
>>>> >>> I tested again after a VM reboot, and this time it is about
>>>> >>> 1000~1500μs.
>>>> >>> I also found that it easily gets high while the hard drive is loading,
>>>> >>> but only a few times.
>>>> >>>
>>>> >>> Which specs are you using? Maybe it depends on the CPU or patches.
>>>> >>>
>>>> >>> 2015-10-29 13:44 GMT+08:00 Blank Field <ihatethisfield at gmail.com>:
>>>> >>>>
>>>> >>>> If i understand it right, this software has a fixed latency error
>>>> >>>> of 1 ms (1000μs) in windows 8-10 due to a different kernel timer
>>>> >>>> implementation. So i guess your latency is very good.
>>>> >>>>
>>>> >>>> On Oct 29, 2015 8:40 AM, "Eddie Yen" <missile0407 at gmail.com> wrote:
>>>> >>>>>
>>>> >>>>> Thanks for the information! And sorry, I didn't read the first
>>>> >>>>> message carefully.
>>>> >>>>>
>>>> >>>>> For my result, I got below about 1000μs, and only a few times
>>>> >>>>> above 1000μs when idling.
>>>> >>>>>
>>>> >>>>> I'm using a 4820K and gave 4 threads to the VM; I also set these 4
>>>> >>>>> threads as 4 cores in the VM settings.
>>>> >>>>> The OS is Windows 10.
>>>> >>>>>
>>>> >>>>> 2015-10-29 13:21 GMT+08:00 Blank Field <ihatethisfield at gmail.com>:
>>>> >>>>>>
>>>> >>>>>> I think they're using this:
>>>> >>>>>> www.thesycon.de/deu/latency_check.shtml
>>>> >>>>>>
>>>> >>>>>> On Oct 29, 2015 6:11 AM, "Eddie Yen" <missile0407 at gmail.com>
>>>> >>>>>> wrote:
>>>> >>>>>>>
>>>> >>>>>>> Sorry, but how to check DPC Latency?
>>>> >>>>>>>
>>>> >>>>>>> 2015-10-29 10:08 GMT+08:00 Nick Sukharev
>>>> >>>>>>> <nicksukharev at gmail.com>:
>>>> >>>>>>>>
>>>> >>>>>>>> I just checked on W7 and I get 3000μs-4000μs on one of the
>>>> >>>>>>>> guests when 3 guests are running.
>>>> >>>>>>>>
>>>> >>>>>>>> On Wed, Oct 28, 2015 at 4:52 AM, Sergey Vlasov
>>>> >>>>>>>> <sergey at vlasov.me>
>>>> >>>>>>>> wrote:
>>>> >>>>>>>>>
>>>> >>>>>>>>> On 27 October 2015 at 18:38, LordZiru <lordziru at gmail.com>
>>>> >>>>>>>>> wrote:
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> I have brutal DPC Latency on qemu, no matter if using
>>>> >>>>>>>>>> pci-assign or vfio-pci or without any passthrough,
>>>> >>>>>>>>>>
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> my DPC Latency is like:
>>>> >>>>>>>>>> 10000,500,8000,6000,800,300,12000,9000,700,2000,9000
>>>> >>>>>>>>>> and on native windows 7 is like:
>>>> >>>>>>>>>> 20,30,20,50,20,30,20,20,30
>>>> >>>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>> In Windows 10 guest I constantly have red bars around 3000μs
>>>> >>>>>>>>> (microseconds), spiking sometimes up to 10000μs.
>>>> >>>>>>>>>
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> I don't know how to fix it.
>>>> >>>>>>>>>> This matters to me because i am using a USB Sound Card for my
>>>> >>>>>>>>>> VMs, and i get sound drop-outs every 0-4 seconds
>>>> >>>>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>> That bugs me a lot too. I also use an external USB card and my
>>>> >>>>>>>>> DAW
>>>> >>>>>>>>> periodically drops out :(
>>>> >>>>>>>>>
>>>> >>>>>>>>> I haven't tried CPU pinning yet though. And perhaps I should
>>>> >>>>>>>>> try
>>>> >>>>>>>>> Windows 7.
>>>> >>>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>> _______________________________________________
>>>> >>>>>>>>> vfio-users mailing list
>>>> >>>>>>>>> vfio-users at redhat.com
>>>> >>>>>>>>> https://www.redhat.com/mailman/listinfo/vfio-users
>>>> >>>>>>>>>



