[vfio-users] Soft lockup on archlinux 4.1.10-1-vfio-lts kernel

Jon Panozzo jonp at lime-technology.com
Fri Oct 23 18:59:03 UTC 2015


I have a Gigabyte board in my situation…

> On Oct 23, 2015, at 1:51 PM, Lucas Kückelhaus <lucas at kuckelhaus.com> wrote:
> 
> One thing I noticed is that we all do seem to have ASRock motherboards, as Mark mentioned. I am hesitant to perform a BIOS upgrade, however; VT-d is finicky enough as it is. I can try 4.1.11 later tonight and see if it helps.
> 
> Regards,
> Lucas Kückelhaus
> 
> On 2015-10-23 15:54, Dan Ziemba wrote:
>> Well, the old systemd and dbus didn't help. The system was locked up
>> again this morning. I left the screen on tailing dmesg, but there was
>> no interesting output. I've got a PKGBUILD for 4.1.11 coming later
>> today, so maybe that will help.
>> Dan
>> On Oct 22, 2015 10:53 PM, "Dan Ziemba" <zman0900 at gmail.com> wrote:
>>> Hey,
>>> I maintain that PKGBUILD. I think I've been having the same problem,
>>> but it seems to also happen if I reinstall the older linux-vfio 4.1.6.
>>> Here's the latest stack trace I was able to capture:
>>> https://i.imgur.com/FZkj4ib.jpg
>>> I had to disable the screen timeout so it would stay on all night with
>>> dmesg tailing, and I found it like this in the morning. The mouse and
>>> caps lock still worked, but I couldn't actually do anything and the
>>> clock was frozen.
>>> I was also noticing that booting my system was unreliable. If I
>>> rebooted several times in a row, roughly once every two to three times
>>> it would hang while starting various services and then never start gdm.
>>> Today I tried downgrading systemd and dbus to just before the change
>>> that switched to user buses (see here:
>>> https://www.archlinux.org/news/d-bus-now-launches-user-buses/). I
>>> rebooted a whole bunch of times using 4.1.10 linux-vfio-lts and it
>>> seems reliable. I have been using the computer pretty much all day for
>>> work and it hasn't had any of the soft lockups yet, but it may be too
>>> soon to tell. Most of the time in the past the lockup would happen
>>> while idle.
>>> These are the downgrades I made; everything else is up to date as of
>>> this morning.
>>> [2015-10-22 12:22] [ALPM] transaction started
>>> [2015-10-22 12:22] [ALPM] downgraded libsystemd (227-1 -> 225-1)
>>> [2015-10-22 12:22] [ALPM] downgraded libdbus (1.10.0-4 -> 1.10.0-2)
>>> [2015-10-22 12:22] [ALPM] downgraded dbus (1.10.0-4 -> 1.10.0-2)
>>> [2015-10-22 12:22] [ALPM] downgraded systemd (227-1 -> 225-1)
>>> [2015-10-22 12:22] [ALPM] downgraded lib32-systemd (227-1 -> 225-1)
>>> [2015-10-22 12:22] [ALPM] downgraded systemd-sysvcompat (227-1 ->
>>> 225-1)
>>> [2015-10-22 12:22] [ALPM] transaction completed
>>> I will follow up tomorrow with whether or not it locks up tonight. If
>>> we can isolate the problem to systemd or dbus, maybe that's at least
>>> good enough for a bug report.
>>> Dan
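To reproduce Dan's downgrade on another Arch machine, the listed versions can usually be reinstalled straight from the local pacman cache and then pinned. A rough sketch; the cache file names are assumptions and depend on what pacman has actually cached locally:

```shell
# Reinstall the pre-user-bus package versions from the local cache
# (file names below are examples; check /var/cache/pacman/pkg first).
pacman -U \
  /var/cache/pacman/pkg/systemd-225-1-x86_64.pkg.tar.xz \
  /var/cache/pacman/pkg/libsystemd-225-1-x86_64.pkg.tar.xz \
  /var/cache/pacman/pkg/systemd-sysvcompat-225-1-x86_64.pkg.tar.xz \
  /var/cache/pacman/pkg/dbus-1.10.0-2-x86_64.pkg.tar.xz \
  /var/cache/pacman/pkg/libdbus-1.10.0-2-x86_64.pkg.tar.xz

# Then keep pacman -Syu from pulling them forward again by adding to
# /etc/pacman.conf:
#   IgnorePkg = systemd libsystemd systemd-sysvcompat dbus libdbus
```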
>>> -----Original Message-----
>>> From: Lucas Kückelhaus <lucas at kuckelhaus.com>
>>> To: vfio-users at redhat.com
>>> Subject: [vfio-users] Soft lockup on archlinux 4.1.10-1-vfio-lts
>>> kernel
>>> Date: Thu, 22 Oct 2015 23:00:37 -0200
>>> Mailer: Roundcube Webmail/1.0.2
>>> Hi,
>>> I'm trying to run an Arch Linux host on kernel 4.1.10-1-vfio-lts (Mark
>>> Weiman's custom repo) because I'm unable to boot a GPU-assigned VM on
>>> 4.2.3-1-vfio.
>>> The VM boots fine and works for a while, but the computer sporadically
>>> crashes with the following:
>>> Oct 22 21:43:37 kvmhost kernel: NMI watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [swapper/4:0]
>>> Oct 22 21:43:39 kvmhost kernel: Modules linked in: veth vhost_net vhost macvtap macvlan tun bridge stp llc nls_iso8859_1 nls_cp437 vfat fat iTCO_wdt iTCO_vendor_support nouveau snd_hda_codec_hdmi intel_rapl iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp mxm_wmi snd_hda_
>>> Oct 22 21:43:39 kvmhost kernel: sch_fq_codel fuse nfsd nfs auth_rpcgss oid_registry nfs_acl lockd grace sunrpc fscache ip_tables x_tables ext4 crc16 mbcache jbd2 dm_mod hid_logitech_hidpp hid_logitech_dj hid_generic usbhid hid sd_mod uas usb_storage atkbd libps2 crc32c_intel ah
>>> Oct 22 21:43:39 kvmhost kernel: CPU: 4 PID: 0 Comm: swapper/4 Tainted: G L 4.1.10-1-vfio-lts #1
>>> Oct 22 21:43:39 kvmhost kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z77 Extreme4, BIOS P2.30 09/21/2012
>>> Oct 22 21:43:39 kvmhost kernel: task: ffff88080b119460 ti: ffff88080b124000 task.ti: ffff88080b124000
>>> Oct 22 21:43:39 kvmhost kernel: RIP: 0010:[<ffffffff810f6770>] [<ffffffff810f6770>] try_to_del_timer_sync+0x0/0xa0
>>> Oct 22 21:43:39 kvmhost kernel: RSP: 0018:ffff88082f303db0 EFLAGS: 00000286
>>> Oct 22 21:43:39 kvmhost kernel: RAX: 00000000ffffffff RBX: 0000000000000286 RCX: 0000000000000000
>>> Oct 22 21:43:39 kvmhost kernel: RDX: 00000000000000bf RSI: 0000000000000286 RDI: ffff880270fa8428
>>> Oct 22 21:43:39 kvmhost kernel: RBP: ffff88082f303dc8 R08: 0000000000002710 R09: ffff88082f30e780
>>> Oct 22 21:43:39 kvmhost kernel: R10: 0000000000000000 R11: 0000000000000004 R12: ffff88082f303d28
>>> Oct 22 21:43:39 kvmhost kernel: R13: ffffffff815f13de R14: ffff88082f303dc8 R15: ffff880270fa8428
>>> Oct 22 21:43:39 kvmhost kernel: FS: 0000000000000000(0000) GS:ffff88082f300000(0000) knlGS:0000000000000000
>>> Oct 22 21:43:39 kvmhost kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> Oct 22 21:43:39 kvmhost kernel: CR2: 00007fc2d6f6da28 CR3: 000000029c65c000 CR4: 00000000001426e0
>>> Oct 22 21:43:39 kvmhost kernel: Stack:
>>> Oct 22 21:43:39 kvmhost kernel: ffffffff810f6872 ffff88082f303e38 ffff880270fa8390 ffff88082f303df8
>>> Oct 22 21:43:39 kvmhost kernel: ffffffff8152a16f ffff880270fa8390 ffff8805b3bab800 ffff880270d20000
>>> Oct 22 21:43:39 kvmhost kernel: 0000000000000001 ffff88082f303e38 ffffffff8152a3e7 ffff88082f3107e0
>>> Oct 22 21:43:39 kvmhost kernel: Call Trace:
>>> Oct 22 21:43:39 kvmhost kernel: <IRQ>
>>> Oct 22 21:43:39 kvmhost kernel: [<ffffffff810f6872>] ? del_timer_sync+0x62/0x70
>>> Oct 22 21:43:39 kvmhost kernel: [<ffffffff8152a16f>] inet_csk_reqsk_queue_drop+0xbf/0x240
>>> Oct 22 21:43:39 kvmhost kernel: [<ffffffff8152a3e7>] reqsk_timer_handler+0xf7/0x2e0
>>> Oct 22 21:43:39 kvmhost kernel: [<ffffffff8152a2f0>] ? inet_csk_reqsk_queue_drop+0x240/0x240
>>> Oct 22 21:43:39 kvmhost kernel: [<ffffffff810f64c8>] call_timer_fn+0x48/0x160
>>> Oct 22 21:43:39 kvmhost kernel: [<ffffffff8152a2f0>] ? inet_csk_reqsk_queue_drop+0x240/0x240
>>> Oct 22 21:43:39 kvmhost kernel: [<ffffffff810f6bd4>] run_timer_softirq+0x284/0x330
>>> Oct 22 21:43:39 kvmhost kernel: [<ffffffff81086711>] __do_softirq+0xf1/0x2e0
>>> Oct 22 21:43:39 kvmhost kernel: [<ffffffff81086acd>] irq_exit+0xbd/0xc0
>>> Oct 22 21:43:39 kvmhost kernel: [<ffffffff815f31d5>] smp_apic_timer_interrupt+0x55/0x70
>>> Oct 22 21:43:39 kvmhost kernel: [<ffffffff815f13de>] apic_timer_interrupt+0x6e/0x80
>>> Oct 22 21:43:39 kvmhost kernel: <EOI>
>>> Oct 22 21:43:39 kvmhost kernel: [<ffffffff81021c1d>] ? native_sched_clock+0x2d/0xa0
>>> Oct 22 21:43:39 kvmhost kernel: [<ffffffff81490c81>] ? cpuidle_enter_state+0xa1/0x250
>>> Oct 22 21:43:39 kvmhost kernel: [<ffffffff81490c53>] ? cpuidle_enter_state+0x73/0x250
>>> Oct 22 21:43:39 kvmhost kernel: [<ffffffff81490e8a>] cpuidle_enter+0x2a/0x30
>>> Oct 22 21:43:39 kvmhost kernel: [<ffffffff810cb36c>] cpu_startup_entry+0x32c/0x460
>>> Oct 22 21:43:39 kvmhost kernel: [<ffffffff81055f7e>] start_secondary+0x19e/0x1e0
>>> Oct 22 21:43:39 kvmhost kernel: Code: 4d d8 65 48 33 0c 25 28 00 00 00 44 89 e0 75 0b 48 83 c4 18 5b 41 5c 41 5d 5d c3 e8 1b b8 f8 ff 90 66 2e 0f 1f 84 00 00 00 00 00 <0f> 1f 44 00 00 55 48 89 e5 41 54 53 48 81 ec 30 10 00 00 48 83
>>> This happens for all cores and it locks up the entire system. I don't
>>> know what to do. On 4.2.3-1-vfio I have no hangups and all my non-vfio
>>> VMs work perfectly fine.
>>> Thank you,
>>> Lucas Kückelhaus
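Notably, the backtrace above never touches VFIO code: the CPU is stuck in the TCP request-socket timer path (reqsk_timer_handler → inet_csk_reqsk_queue_drop → del_timer_sync). With a persistent journal on the host, one way to check Lucas's "happens for all cores" observation is to tally the recorded lockups per CPU; a sketch, assuming journalctl has kernel messages from earlier boots:

```shell
# Tally soft-lockup reports per CPU across everything in the journal;
# grep pulls out the "soft lockup - CPU#N" fragment from each report
# and sort | uniq -c turns the fragments into per-CPU counts.
journalctl -k --no-pager \
  | grep -o 'soft lockup - CPU#[0-9]*' \
  | sort | uniq -c
```

If the counts are spread across CPUs rather than pinned to one, that points away from a single misbehaving core or IRQ affinity problem.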
>>> _______________________________________________
>>> vfio-users mailing list
>>> vfio-users at redhat.com
>>> https://www.redhat.com/mailman/listinfo/vfio-users
> 
