[vfio-users] Best pinning strategy for latency / performance trade-off

Thomas Lindroth thomas.lindroth at gmail.com
Wed Feb 8 09:56:35 UTC 2017


On 02/06/2017 11:08 PM, Abdulla Bubshait wrote:
> Hey Thomas,
> 
> Great analysis, thanks for sharing it all. But I had a question. You
> mention setting realtime pri on the VM threads, and how it works with
> NO_HZ_FULL. And I was wondering wouldn't you be able to just use
> isolcpus in cmd line to leave the VM cores to the VM as an alternative?

In this test I rely on cpuset to isolate the VM cores for exclusive use by
the VM, but cpuset can't completely isolate cores. There are always some
kernel threads remaining that might interfere with the VM. I've never
tested isolcpus, but I've read that it has the same problem and can't
migrate all kthreads. Cpuset can also be turned on and off dynamically, so
it's a lot more convenient.

To find out what processes run on a core, try running

    perf record -e "sched:sched_switch" -C 1,2,3

where -C lists the cores you want to inspect. Let it run for a while and
then run

    perf report --fields=sample,overhead,cpu,comm

If you have isolated the cores so only the VM may run on them, you'd expect
to see only the "CPU #/KVM" threads, but there will probably be other stuff
running as well. Swapper is the kernel's idle task, which runs when a core
has nothing else to do, so it's normal to see that. Kvm-pit is a kernel
thread for handling the virtual PIT, so if the VM uses the PIT you'll see
that. Besides those you'll probably also see a bunch of kworker threads. By
using realtime priority the VM threads can preempt them.
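
As a rough illustration of the realtime priority step, here is a minimal
sketch that picks out the vCPU thread IDs from ps output. It assumes the
vCPU threads are named "CPU <n>/KVM" as in the perf report; the function
name vcpu_tids, the QEMU_PID variable and the priority value 1 are made up
for the example and are not taken from my actual setup.

```shell
# Sketch: print the thread IDs of QEMU vCPU threads.
# Reads "TID COMM" lines on stdin, as produced by:
#   ps -T -p "$QEMU_PID" -o tid=,comm=
# Assumes vCPU threads are named "CPU <n>/KVM".
vcpu_tids() {
    awk '$2 == "CPU" && $3 ~ /^[0-9]+\/KVM$/ { print $1 }'
}

# Hypothetical usage (needs root; QEMU_PID and priority 1 are assumptions):
#   ps -T -p "$QEMU_PID" -o tid=,comm= | vcpu_tids |
#       while read -r tid; do chrt -f -p 1 "$tid"; done
```

The filter only reads text, so it can be tried on saved ps output before
touching any scheduling policy.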

https://www.kernel.org/doc/Documentation/kernel-per-CPU-kthreads.txt
This kernel doc describes a lot of tricks for disabling those kernel threads
that can't be migrated. The tricks I rely on in this test were to move all
interrupts to the housekeeping cpu and run the script below.

# The kernel's dirty page writeback mechanism uses kthread workers. They introduce
# massive arbitrary latencies when doing disk writes on the host and aren't
# migrated by cset. Restrict the workqueues to use only cpu 0 (the value is a
# hex cpu bitmask, so 1 means cpu 0).
echo 1 > /sys/devices/virtual/workqueue/cpumask

# The CONFIG_LOCKUP_DETECTOR watchdog will wake up occasionally, resulting in jitter.
# Temporarily disable it.
echo 0 > /proc/sys/kernel/watchdog

# The vmstat_update worker can't be disabled but it can be delayed a bit.
# The value is in seconds. This makes the statistics in /proc/vmstat imprecise.
echo 300 > /proc/sys/vm/stat_interval

# THP can allegedly result in OS jitter. Better keep it off.
echo never > /sys/kernel/mm/transparent_hugepage/enabled
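
Both the workqueue cpumask and the per-IRQ smp_affinity files take a
hexadecimal cpu bitmask, so moving the interrupts to the housekeeping cpu
could look something like the sketch below. The function names cpus_to_mask
and pin_irqs are invented for this example, and it assumes a machine with at
most 32 cpus (larger masks are written as comma-separated groups). Some IRQs
refuse the write, so failures are ignored.

```shell
# Convert a list of cpu numbers into the hex bitmask format used by
# /sys/devices/virtual/workqueue/cpumask and /proc/irq/*/smp_affinity.
cpus_to_mask() {
    mask=0
    for cpu in "$@"; do
        mask=$(( mask | (1 << cpu) ))
    done
    printf '%x\n' "$mask"
}

# Write a mask to every smp_affinity file under the given root
# (default /proc/irq). IRQs that reject the write are skipped silently.
pin_irqs() {
    mask=$1
    root=${2:-/proc/irq}
    for f in "$root"/*/smp_affinity; do
        [ -e "$f" ] || continue
        { echo "$mask" > "$f"; } 2>/dev/null || true
    done
}

# Hypothetical usage as root, with cpu 0 as the housekeeping cpu:
#   pin_irqs "$(cpus_to_mask 0)"
```

The optional second argument to pin_irqs is only there so the loop can be
tried against a scratch directory before pointing it at /proc/irq.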

I also compile my kernel with these options:

CONFIG_RCU_NOCB_CPU_ALL:
Makes the RCU callbacks run in kthreads, which can be migrated to the
housekeeping cpu.

CONFIG_WQ_POWER_EFFICIENT_DEFAULT:
The documentation for this option suggests it makes it possible to
migrate more kthreads, but I haven't confirmed that.

CONFIG_NO_HZ_FULL_ALL:
Turns off the kernel tick on cpus with only one runnable task so that
task can run uninterrupted.
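
To see whether a given kernel was built with these options, something like
the following sketch may help. The function name check_kernel_opts and the
default config path are assumptions; some distributions expose the config as
/proc/config.gz instead, and other kernel versions may not have options
under these exact names.

```shell
# Report which of the three options are enabled (=y) in a kernel config file.
# The default path is an assumption and may not exist on every distribution.
check_kernel_opts() {
    config=${1:-/boot/config-$(uname -r)}
    for opt in RCU_NOCB_CPU_ALL WQ_POWER_EFFICIENT_DEFAULT NO_HZ_FULL_ALL; do
        if grep -q "^CONFIG_${opt}=y" "$config" 2>/dev/null; then
            echo "CONFIG_${opt}: enabled"
        else
            echo "CONFIG_${opt}: not set"
        fi
    done
}

# Hypothetical usage:
#   check_kernel_opts /usr/src/linux/.config
```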



