[vfio-users] Soft lockup on archlinux 4.1.10-1-vfio-lts kernel

Mark Weiman mark.weiman at markzz.com
Sat Oct 24 02:30:12 UTC 2015


I am currently building 4.1.11-2 for my repository with your suggestion
of using Arch's config files for linux-lts.

I will also post the source package used [1].  This is in case someone
wants to check it or doesn't feel comfortable using my package and
wants to build it himself/herself.

Mark Weiman

[1] http://repo.markzz.com/src/arch/markzz/linux-vfio-lts-4.1.11-2.src.tar.gz
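
For anyone who wants to see exactly which options differ between two kernel
configurations (for example the config shipped in a custom package versus the
one from the official linux-lts package), the following is a minimal sketch,
not a tool used by anyone in this thread. The function names and the two file
paths passed on the command line are hypothetical; the only assumption about
the input is the usual kernel .config syntax ("CONFIG_X=value" and
"# CONFIG_X is not set").

```python
import sys


def parse_kconfig(text):
    """Map each CONFIG_ option to its value; options marked 'is not set' map to 'n'."""
    options = {}
    unset_suffix = " is not set"
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# CONFIG_") and line.endswith(unset_suffix):
            # "# CONFIG_FOO is not set" -> CONFIG_FOO: "n"
            options[line[2:-len(unset_suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def diff_kconfig(a_text, b_text):
    """Return {option: (value_in_a, value_in_b)} for every option that differs."""
    a, b = parse_kconfig(a_text), parse_kconfig(b_text)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python kconfig_diff.py config.custom config.linux-lts
    with open(sys.argv[1]) as fa, open(sys.argv[2]) as fb:
        for name, (old, new) in diff_kconfig(fa.read(), fb.read()).items():
            print(f"{name}: {old} -> {new}")
```

A value of None in the output means the option does not appear in that file at
all, which is distinct from an option that is explicitly not set.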

On Sat, 2015-10-24 at 08:28 +0700, Okky Hendriansyah wrote:
> What kind of lockup do you mean? I'm on an ASRock Z87 Extreme6 board
> and I was using your linux-lts-vfio, never had any lockups. I did
> install intel microcode though.
> 
> But then I tried to use the ABS approach with linux-lts and apply the
> patches from your PKGBUILD. Currently I'm using linux-lts with i915
> and ACS patches compiled using ABS and the host system is quite
> stable.
> 
> I noticed there're some diff lines between your linux config and the
> one from linux-lts, have you tried to use the config from official
> linux-lts?
> 
> Best regards,
> Okky Hendriansyah
> 
> > On Oct 24, 2015, at 07:50, Dan Ziemba <zman0900 at gmail.com> wrote:
> > 
> > I just released the 4.1.11 PKGBUILD.  So far so good for me, but
> > it's
> > only been running for a few hours - not really long enough to
> > tell.  
> > 
> > I do have ASRock too, but it is on nearly the latest uefi firmware.
> >  There is one newer version, but it says the only change is the
> > servers
> > used for online update.
> > 
> > I never got around to setting up the intel microcode updates, so
> > that
> > should probably be my next step.
> > 
> > Dan
> > 
> > -----Original Message-----
> > From: Mark Weiman <mark.weiman at markzz.com>
> > To: vfio-users at redhat.com
> > Subject: Re: [vfio-users] Soft lockup on archlinux 4.1.10-1-vfio-lts kernel
> > Date: Fri, 23 Oct 2015 18:56:39 -0400
> > 
> > To be honest, ASRock BIOS upgrades are fairly painless because they
> > can
> > be done outside of the operating system, so no need to get an image
> > of
> > FreeDOS ready.  If you do not want to get that though, I do still
> > recommend the intel-ucode package if you don't already.  As of
> > right
> > now, I have no issues running my repository's 4.1.11-1 package.
> > 
> > Mark Weiman
> > 
> > > On Fri, 2015-10-23 at 16:51 -0200, Lucas Kückelhaus wrote:
> > > One thing I noticed is that we all do seem to have ASROCK
> > > motherboards 
> > > as Mark mentioned. I am hesitant to perform a bios upgrade,
> > > however. 
> > > VT-D is finicky enough as is. I can try 4.1.11 later tonight and
> > > see
> > > if 
> > > it helps.
> > > 
> > > Regards,
> > > Lucas Kückelhaus
> > > 
> > > > On 2015-10-23 15:54, Dan Ziemba wrote:
> > > > Well, old systemd and dbus didn't help. System was locked up
> > > > again
> > > > this morning.  Left the screen on tailing dmesg, but there was
> > > > nothing
> > > > interesting output.  I've got a PKGBUILD for 4.1.11 coming
> > > > later
> > > > today, so maybe that will help.
> > > > 
> > > > Dan
> > > > > On Oct 22, 2015 10:53 PM, "Dan Ziemba" <zman0900 at gmail.com>
> > > > > wrote:
> > > > > 
> > > > > Hey,
> > > > > 
> > > > > I maintain that PKGBUILD. I think I've been having the same
> > > > > problem,
> > > > > > but it seems to also happen if I reinstall the older
> > > > > > linux-vfio 4.1.6.
> > > > > Here's the latest stack trace I was able to capture:
> > > > > > https://i.imgur.com/FZkj4ib.jpg [1]
> > > > > > I had to disable the screen timeout so it would
> > > > > stay
> > > > > on
> > > > > all night with dmesg tailing and I found it like this in the
> > > > > morning.
> > > > > Mouse and caps lock still worked, but I couldn't actually do
> > > > > anything
> > > > > and the clock was frozen.
> > > > > 
> > > > > I was also noticing that booting my system was unreliable. If
> > > > > I
> > > > > would
> > > > > > reboot several times in a row, once every two to three times,
> > > > > it
> > > > > would
> > > > > hang while starting various services and then never start
> > > > > gdm.
> > > > > 
> > > > > Today I tried downgrading systemd and dbus to just before the
> > > > > change
> > > > > that switched to user buses (See here:
> > > > > > https://www.archlinux.org/news/d-bus-now-launches-user-buses/)
> > > > > > I rebooted a whole bunch of
> > > > > times
> > > > > using
> > > > > 4.1.10 linux-vfio-lts and it seems reliable. I have been
> > > > > using
> > > > > the
> > > > > computer pretty much all day for work and it hasn't had any
> > > > > of
> > > > > the
> > > > > soft
> > > > > lockup yet, but it may be too soon to tell. Most of the time
> > > > > in
> > > > > the
> > > > > past the lockup would happen while idle.
> > > > > 
> > > > > These are the downgrades I made, everything else is up to
> > > > > date as
> > > > > of
> > > > > this morning.
> > > > > 
> > > > > [2015-10-22 12:22] [ALPM] transaction started
> > > > > > [2015-10-22 12:22] [ALPM] downgraded libsystemd (227-1 -> 225-1)
> > > > > > [2015-10-22 12:22] [ALPM] downgraded libdbus (1.10.0-4 -> 1.10.0-2)
> > > > > > [2015-10-22 12:22] [ALPM] downgraded dbus (1.10.0-4 -> 1.10.0-2)
> > > > > > [2015-10-22 12:22] [ALPM] downgraded systemd (227-1 -> 225-1)
> > > > > > [2015-10-22 12:22] [ALPM] downgraded lib32-systemd (227-1 -> 225-1)
> > > > > > [2015-10-22 12:22] [ALPM] downgraded systemd-sysvcompat (227-1 -> 225-1)
> > > > > [2015-10-22 12:22] [ALPM] transaction completed
> > > > > 
> > > > > I will follow up tomorrow with whether or not it locks up
> > > > > tonight.
> > > > > If
> > > > > we can isolate the problem to systemd or dbus, maybe that's
> > > > > at
> > > > > least
> > > > > good enough for a bug report.
> > > > > 
> > > > > Dan
> > > > > 
> > > > > -----Original Message-----
> > > > > From: Lucas Kückelhaus <lucas at kuckelhaus.com>
> > > > > To: vfio-users at redhat.com
> > > > > > Subject: [vfio-users] Soft lockup on archlinux 4.1.10-1-vfio-lts kernel
> > > > > Date: Thu, 22 Oct 2015 23:00:37 -0200
> > > > > Mailer: Roundcube Webmail/1.0.2
> > > > > 
> > > > > Hi,
> > > > > 
> > > > > > I'm trying to run an Archlinux host on kernel 4.1.10-1-vfio-lts
> > > > > > (Mark Weiman's custom repo) because I'm unable to boot a
> > > > > > GPU-assigned VM on 4.2.3-1-vfio.
> > > > > 
> > > > > The VM boots fine and works for a while, but the computer
> > > > > sporadically
> > > > > crashes with the following:
> > > > > 
> > > > > Oct 22 21:43:37 kvmhost kernel: NMI watchdog: BUG: soft
> > > > > lockup -
> > > > > CPU#4
> > > > > stuck for 22s! [swapper/4:0]
> > > > > Oct 22 21:43:39 kvmhost kernel: Modules linked in: veth
> > > > > vhost_net
> > > > > vhost
> > > > > macvtap macvlan tun bridge stp llc nls_iso8859_1 nls_cp437
> > > > > vfat
> > > > > fat
> > > > > iTCO_wdt iTCO_vendor_support nouveau snd_hda_codec_hdmi
> > > > > intel_rapl
> > > > > iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp
> > > > > mxm_wmi
> > > > > snd_hda_
> > > > > Oct 22 21:43:39 kvmhost kernel: sch_fq_codel fuse nfsd nfs
> > > > > auth_rpcgss
> > > > > oid_registry nfs_acl lockd grace sunrpc fscache ip_tables
> > > > > x_tables
> > > > > ext4
> > > > > crc16 mbcache jbd2 dm_mod hid_logitech_hidpp hid_logitech_dj
> > > > > hid_generic
> > > > > usbhid hid sd_mod uas usb_storage atkbd libps2 crc32c_intel
> > > > > ah
> > > > > Oct 22 21:43:39 kvmhost kernel: CPU: 4 PID: 0 Comm: swapper/4
> > > > > Tainted: G
> > > > > L 4.1.10-1-vfio-lts #1
> > > > > Oct 22 21:43:39 kvmhost kernel: Hardware name: To Be Filled
> > > > > By
> > > > > O.E.M. To
> > > > > Be Filled By O.E.M./Z77 Extreme4, BIOS P2.30 09/21/2012
> > > > > Oct 22 21:43:39 kvmhost kernel: task: ffff88080b119460 ti:
> > > > > ffff88080b124000 task.ti: ffff88080b124000
> > > > > Oct 22 21:43:39 kvmhost kernel: RIP:
> > > > > 0010:[<ffffffff810f6770>]
> > > > > [<ffffffff810f6770>] try_to_del_timer_sync+0x0/0xa0
> > > > > Oct 22 21:43:39 kvmhost kernel: RSP: 0018:ffff88082f303db0
> > > > > EFLAGS:
> > > > > 00000286
> > > > > Oct 22 21:43:39 kvmhost kernel: RAX: 00000000ffffffff RBX:
> > > > > 0000000000000286 RCX: 0000000000000000
> > > > > Oct 22 21:43:39 kvmhost kernel: RDX: 00000000000000bf RSI:
> > > > > 0000000000000286 RDI: ffff880270fa8428
> > > > > Oct 22 21:43:39 kvmhost kernel: RBP: ffff88082f303dc8 R08:
> > > > > 0000000000002710 R09: ffff88082f30e780
> > > > > Oct 22 21:43:39 kvmhost kernel: R10: 0000000000000000 R11:
> > > > > 0000000000000004 R12: ffff88082f303d28
> > > > > Oct 22 21:43:39 kvmhost kernel: R13: ffffffff815f13de R14:
> > > > > ffff88082f303dc8 R15: ffff880270fa8428
> > > > > Oct 22 21:43:39 kvmhost kernel: FS: 0000000000000000(0000)
> > > > > GS:ffff88082f300000(0000) knlGS:0000000000000000
> > > > > Oct 22 21:43:39 kvmhost kernel: CS: 0010 DS: 0000 ES: 0000
> > > > > CR0:
> > > > > 0000000080050033
> > > > > Oct 22 21:43:39 kvmhost kernel: CR2: 00007fc2d6f6da28 CR3:
> > > > > 000000029c65c000 CR4: 00000000001426e0
> > > > > Oct 22 21:43:39 kvmhost kernel: Stack:
> > > > > Oct 22 21:43:39 kvmhost kernel: ffffffff810f6872
> > > > > ffff88082f303e38
> > > > > ffff880270fa8390 ffff88082f303df8
> > > > > Oct 22 21:43:39 kvmhost kernel: ffffffff8152a16f
> > > > > ffff880270fa8390
> > > > > ffff8805b3bab800 ffff880270d20000
> > > > > Oct 22 21:43:39 kvmhost kernel: 0000000000000001
> > > > > ffff88082f303e38
> > > > > ffffffff8152a3e7 ffff88082f3107e0
> > > > > Oct 22 21:43:39 kvmhost kernel: Call Trace:
> > > > > Oct 22 21:43:39 kvmhost kernel: <IRQ>
> > > > > Oct 22 21:43:39 kvmhost kernel: [<ffffffff810f6872>] ?
> > > > > del_timer_sync+0x62/0x70
> > > > > Oct 22 21:43:39 kvmhost kernel: [<ffffffff8152a16f>]
> > > > > inet_csk_reqsk_queue_drop+0xbf/0x240
> > > > > Oct 22 21:43:39 kvmhost kernel: [<ffffffff8152a3e7>]
> > > > > reqsk_timer_handler+0xf7/0x2e0
> > > > > Oct 22 21:43:39 kvmhost kernel: [<ffffffff8152a2f0>] ?
> > > > > inet_csk_reqsk_queue_drop+0x240/0x240
> > > > > Oct 22 21:43:39 kvmhost kernel: [<ffffffff810f64c8>]
> > > > > call_timer_fn+0x48/0x160
> > > > > Oct 22 21:43:39 kvmhost kernel: [<ffffffff8152a2f0>] ?
> > > > > inet_csk_reqsk_queue_drop+0x240/0x240
> > > > > Oct 22 21:43:39 kvmhost kernel: [<ffffffff810f6bd4>]
> > > > > run_timer_softirq+0x284/0x330
> > > > > Oct 22 21:43:39 kvmhost kernel: [<ffffffff81086711>]
> > > > > __do_softirq+0xf1/0x2e0
> > > > > Oct 22 21:43:39 kvmhost kernel: [<ffffffff81086acd>]
> > > > > irq_exit+0xbd/0xc0
> > > > > Oct 22 21:43:39 kvmhost kernel: [<ffffffff815f31d5>]
> > > > > smp_apic_timer_interrupt+0x55/0x70
> > > > > Oct 22 21:43:39 kvmhost kernel: [<ffffffff815f13de>]
> > > > > apic_timer_interrupt+0x6e/0x80
> > > > > Oct 22 21:43:39 kvmhost kernel: <EOI>
> > > > > Oct 22 21:43:39 kvmhost kernel: [<ffffffff81021c1d>] ?
> > > > > native_sched_clock+0x2d/0xa0
> > > > > Oct 22 21:43:39 kvmhost kernel: [<ffffffff81490c81>] ?
> > > > > cpuidle_enter_state+0xa1/0x250
> > > > > Oct 22 21:43:39 kvmhost kernel: [<ffffffff81490c53>] ?
> > > > > cpuidle_enter_state+0x73/0x250
> > > > > Oct 22 21:43:39 kvmhost kernel: [<ffffffff81490e8a>]
> > > > > cpuidle_enter+0x2a/0x30
> > > > > Oct 22 21:43:39 kvmhost kernel: [<ffffffff810cb36c>]
> > > > > cpu_startup_entry+0x32c/0x460
> > > > > Oct 22 21:43:39 kvmhost kernel: [<ffffffff81055f7e>]
> > > > > start_secondary+0x19e/0x1e0
> > > > > Oct 22 21:43:39 kvmhost kernel: Code: 4d d8 65 48 33 0c 25 28
> > > > > 00
> > > > > 00
> > > > > 00
> > > > > 44 89 e0 75 0b 48 83 c4 18 5b 41 5c 41 5d 5d c3 e8 1b b8 f8
> > > > > ff 90
> > > > > 66 2e
> > > > > 0f 1f 84 00 00 00 00 00 <0f> 1f 44 00 00 55 48 89 e5 41 54 53
> > > > > 48
> > > > > 81
> > > > > ec
> > > > > 30 10 00 00 48 83
> > > > > 
> > > > > > This happens for all cores and it locks up the entire system.
> > > > > > I don't know what to do. On 4.2.3-1-vfio I have no hangups and
> > > > > > all my non-vfio VMs work perfectly fine.
> > > > > 
> > > > > Thank you,
> > > > > Lucas Kückelhaus
> > > > > 
> > > > > _______________________________________________
> > > > > vfio-users mailing list
> > > > > vfio-users at redhat.com
> > > > > https://www.redhat.com/mailman/listinfo/vfio-users [2]
> > > > 
> > > > 
> > > > Links:
> > > > ------
> > > > > [1] https://i.imgur.com/FZkj4ib.jpg
> > > > [2] https://www.redhat.com/mailman/listinfo/vfio-users
> > > 
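
The soft-lockup report quoted in the original message follows the kernel's
usual "BUG: soft lockup - CPU#N stuck for Ns! [comm:pid]" form. As a rough
sketch (not something any poster in this thread ran), a saved kernel log could
be scanned for such reports to see which CPUs were affected and when. The
function name is hypothetical, and the code assumes one log entry per line
(for example, output saved from journalctl -k), not the email-wrapped form
quoted above.

```python
import re
import sys

# Matches the watchdog report, e.g.
# "BUG: soft lockup - CPU#4 stuck for 22s! [swapper/4:0]"
SOFT_LOCKUP = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>.+):(?P<pid>\d+)\]"
)


def find_soft_lockups(log_text):
    """Return one dict per soft-lockup report found in the given log text."""
    events = []
    for line in log_text.splitlines():
        match = SOFT_LOCKUP.search(line)
        if match:
            events.append(
                {
                    "cpu": int(match["cpu"]),
                    "seconds": int(match["secs"]),
                    "comm": match["comm"],
                    "pid": int(match["pid"]),
                    "line": line,
                }
            )
    return events


if __name__ == "__main__" and not sys.stdin.isatty():
    # Hypothetical usage: journalctl -k | python soft_lockups.py
    for event in find_soft_lockups(sys.stdin.read()):
        print(f"CPU {event['cpu']} stuck {event['seconds']}s in {event['comm']}")
```

Grouping the results by CPU number would show whether every core reports a
lockup, as described in the original message, or only a subset.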