
Re: [vfio-users] 1 GB hugepages cause host crash on guest shutdown with some GPUs

On Mon, 2015-11-30 at 18:37 +0700, Okky Hendriansyah wrote:
> Hmm interesting, I found my setup to show weird white flickery things
> on the guest desktop when I use 1 GB hugepages. Did you notice these
> things on yours? Falling back to 2 MB hugepages bring things back to
> normal. 
Nope, everything in the VM seemed fine with both the old and new GPU
when using 1 GB hugepages.

> I didn’t experience the freezing/crashing on the hypervisor host like
> you describe though. How do you bind the GPU? Is it pci-stub or vfio-
> pci? I use vfio-pci and load it statically during boot.
I'm using the vfio-pci module, set up with this line in a modprobe.d file.
On Arch, that file gets copied into the initramfs, so it applies before
the radeon module loads.

options vfio-pci disable_vga ids=1002:67b0,1002:aac0,1033:0194,8086:8d62
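
To confirm that the devices named in that line really ended up on vfio-pci, one option is to read the standard sysfs PCI attributes (vendor, device, and the driver symlink). The following is only a rough sketch under that assumption; the helper names parse_ids, read_hex_id and bound_drivers are made up for illustration and are not part of any tool mentioned in this thread.

```python
import os

# The options line from the message above, copied verbatim.
OPTIONS_LINE = "options vfio-pci disable_vga ids=1002:67b0,1002:aac0,1033:0194,8086:8d62"


def parse_ids(options_line):
    """Return (vendor, device) hex pairs from the ids= parameter, if present."""
    for token in options_line.split():
        if token.startswith("ids="):
            # Only the first two fields (vendor:device) are used here.
            return [tuple(entry.lower().split(":")[:2])
                    for entry in token[len("ids="):].split(",") if entry]
    return []


def read_hex_id(path):
    """Read a sysfs ID file such as 'vendor' (contents like '0x1002')."""
    with open(path) as f:
        text = f.read().strip().lower()
    return text[2:] if text.startswith("0x") else text


def bound_drivers(ids, sysfs_root="/sys/bus/pci/devices"):
    """Map PCI address -> bound driver name (None if unbound) for matching devices."""
    wanted = set(ids)
    found = {}
    if not os.path.isdir(sysfs_root):
        return found
    for addr in sorted(os.listdir(sysfs_root)):
        dev = os.path.join(sysfs_root, addr)
        try:
            key = (read_hex_id(os.path.join(dev, "vendor")),
                   read_hex_id(os.path.join(dev, "device")))
        except OSError:
            continue  # not a readable PCI device directory
        if key in wanted:
            link = os.path.join(dev, "driver")
            found[addr] = (os.path.basename(os.readlink(link))
                           if os.path.islink(link) else None)
    return found


if __name__ == "__main__":
    for addr, driver in bound_drivers(parse_ids(OPTIONS_LINE)).items():
        print(addr, driver or "(no driver bound)")
```

Run on the host after boot, each listed device should report vfio-pci if the early binding worked; a device showing radeon or another driver would mean the module options were applied too late.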

I've discovered that switching to 2 MB hugepages doesn't completely solve
the problem.  I'm still getting the crashes on guest shutdown, but
usually only if the guest has been running for more than a few minutes.

Just got an interesting one now.  Maybe I will try turning off NX.

[84786.642703] kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
[84786.642724] BUG: unable to handle kernel paging request at ffff88087b32c430
[84786.642740] IP: [<ffff88087b32c430>] 0xffff88087b32c430
[84786.642754] PGD 1b2f067 PUD 87ed86063 PMD 6c9662063 PTE 800000087b32c163
[84786.642771] Oops: 0011 [#1] PREEMPT SMP 
[84786.642782] Modules linked in: vhost_net vhost macvtap macvlan tun dm_snapshot dm_bufio nfsv3 hid_sony ff_memless led_class rpcsec_gss_krb5 nfsv4 dns_resolver ebtable_filter ebtables fuse bridge stp llc nf_log_ipv4 nf_log_ipv6 msr nf_log_common xt_LOG xt_tcpudp xt_pkttype ip6t_rt nf_conntrack_ipv6 nf_conntrack_ipv4 nf_defrag_ipv6 nf_defrag_ipv4 xt_recent xt_addrtype xt_conntrack ip6table_filter nf_conntrack ip6_tables iptable_filter nct6775 hwmon_vid joydev mousedev hid_generic usbhid hid snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi nls_iso8859_1 nls_cp437 vfat fat intel_rapl iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm snd_hda_intel crct10dif_pclmul crc32_pclmul snd_hda_controller aesni_intel snd_hda_codec iTCO_wdt iTCO_vendor_support aes_x86_64 lrw e1000e
[84786.642974]  snd_hda_core mxm_wmi snd_hwdep gf128mul glue_helper snd_pcm ablk_helper evdev mac_hid psmouse snd_timer cryptd pcspkr ptp serio_raw snd mei_me sb_edac pps_core mei soundcore edac_core i2c_i801 lpc_ich shpchp tpm_tis tpm wmi processor button sch_fq_codel nfsd nfs auth_rpcgss oid_registry nfs_acl lockd grace sunrpc fscache ip_tables x_tables ext4 crc16 mbcache jbd2 uas usb_storage dm_mod sr_mod cdrom sd_mod atkbd libps2 crc32c_intel ahci xhci_pci libahci xhci_hcd libata usbcore usb_common scsi_mod i8042 serio radeon i2c_algo_bit drm_kms_helper ttm drm i2c_core vfio_pci vfio_virqfd vfio_iommu_type1 vfio
[84786.643133] CPU: 2 PID: 1172 Comm: libvirtd Not tainted 4.1.13-1-vfio-lts #1
[84786.643148] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X99 Extreme4, BIOS P2.00 06/01/2015
[84786.643168] task: ffff8806a5c9dbb0 ti: ffff8806a5d68000 task.ti: ffff8806a5d68000
[84786.643183] RIP: 0010:[<ffff88087b32c430>]  [<ffff88087b32c430>] 0xffff88087b32c430
[84786.643201] RSP: 0018:ffff8806a5d6bc70  EFLAGS: 00010286
[84786.643212] RAX: ffff88087b32c430 RBX: ffff88087a9fd098 RCX: 0000000000000000
[84786.643226] RDX: 0000000000000000 RSI: ffff88087a9fd098 RDI: ffff88087a9fd098
[84786.643241] RBP: ffff8806a5d6bc98 R08: 0000000000000002 R09: ffff8806a5d6bc3c
[84786.643255] R10: 0000000000000001 R11: 0000000000000400 R12: ffff88087a9fd146
[84786.643269] R13: ffff88087b32c430 R14: 0000000000000000 R15: 000000000000000c
[84786.643284] FS:  00007f1a12a05800(0000) GS:ffff88087f280000(0000) knlGS:0000000000000000
[84786.643300] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[84786.643311] CR2: ffff88087b32c430 CR3: 00000006abe0f000 CR4: 00000000001406e0
[84786.643326] Stack:
[84786.643330]  ffffffff813ff486 00088806a5d6bca8 0000000000000048 ffff88087a9fd098
[84786.643348]  0000000000000004 ffff8806a5d6bcc8 ffffffff81400391 ffff88087a9fd098
[84786.643366]  0000000000000004 ffff88087a9fd146 0000000000000246 ffff8806a5d6bcf8
[84786.643384] Call Trace:
[84786.643391]  [<ffffffff813ff486>] ? __rpm_callback+0x36/0x90
[84786.643404]  [<ffffffff81400391>] rpm_idle+0x231/0x2a0
[84786.643415]  [<ffffffff81400453>] __pm_runtime_idle+0x53/0x70
[84786.643430]  [<ffffffff81312fe8>] pci_device_remove+0x78/0xc0
[84786.643444]  [<ffffffff813f5247>] __device_release_driver+0x87/0x120
[84786.643458]  [<ffffffff813f5303>] device_release_driver+0x23/0x30
[84786.643471]  [<ffffffff813f4105>] unbind_store+0x115/0x160
[84786.643483]  [<ffffffff813f31e5>] drv_attr_store+0x25/0x40
[84786.643496]  [<ffffffff8125beea>] sysfs_kf_write+0x3a/0x50
[84786.643509]  [<ffffffff8125b3e7>] kernfs_fop_write+0x127/0x180
[84786.643522]  [<ffffffff811e0b37>] __vfs_write+0x37/0x110
[84786.643534]  [<ffffffff811e3b98>] ? __sb_start_write+0x58/0x110
[84786.643549]  [<ffffffff81283ec3>] ? security_file_permission+0x23/0xa0
[84786.643562]  [<ffffffff811e1504>] vfs_write+0xa4/0x1c0
[84786.643574]  [<ffffffff811e2289>] SyS_write+0x59/0xd0
[84786.643587]  [<ffffffff8158da6e>] system_call_fastpath+0x12/0x71
[84786.643599] Code: 88 ff ff 00 c4 32 7b 08 88 ff ff 10 c4 32 7b 08 88 ff ff 10 c4 32 7b 08 88 ff ff 20 c4 32 7b 08 88 ff ff 20 c4 32 7b 08 88 ff ff <30> c4 32 7b 08 88 ff ff 30 c4 32 7b 08 88 ff ff 40 c4 32 7b 08 
[84786.643679] RIP  [<ffff88087b32c430>] 0xffff88087b32c430
[84786.643692]  RSP <ffff8806a5d6bc70>
[84786.643700] CR2: ffff88087b32c430
[84786.648659] ---[ end trace 26c695decbecd868 ]---
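
For reading the trace above, it can help to decode the "Oops: 0011" error code and the flag bits of the PTE value printed on the PGD/PUD/PMD/PTE line. The sketch below uses the architectural x86-64 bit positions for the page-fault error code and for 4 KB page-table entries; the table and function names are illustrative only.

```python
# x86 page-fault error code bits (the number printed after "Oops:" is hex).
PF_BITS = [
    (0, "protection violation (page was present)"),
    (1, "write access"),
    (2, "user-mode access"),
    (3, "reserved bit set in page table"),
    (4, "instruction fetch"),
]

# Selected x86-64 page-table entry flag bits.
PTE_BITS = [
    (0, "present"),
    (1, "writable"),
    (2, "user"),
    (5, "accessed"),
    (6, "dirty"),
    (8, "global"),
    (63, "no-execute"),
]

# Bits 12..51 of a 4 KB PTE hold the physical frame address.
FRAME_MASK = 0x000FFFFFFFFFF000


def decode(value, table):
    """Return the names of all bits from `table` that are set in `value`."""
    return [name for bit, name in table if (value >> bit) & 1]


oops_code = int("0011", 16)   # from "Oops: 0011"
pte = 0x800000087B32C163      # from "PTE 800000087b32c163"

print(decode(oops_code, PF_BITS))
print(decode(pte, PTE_BITS))
print(hex(pte & FRAME_MASK))
```

Decoded this way, the error code is an instruction fetch from a page that is present, and the PTE has the no-execute bit set, so the fault is a real NX violation on a mapped, writable page rather than an access to an unmapped address.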
