dri xscreensaver/kernel lockups
alan
alan at clueserver.org
Thu Oct 9 10:32:45 UTC 2003
I have seen something that looks like a lockup but is not: the system
still works, but X is stuck in an alarm wait loop.
The telling difference is that you can still ssh into the box and strace
X. X will be pegged at 99+% CPU. Of course, if you only have one CPU, the
machine may seem dead.
Want a copy of the strace output for the exact messages?
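
For anyone who wants to check for the same symptom over ssh, here is a
rough sketch of the "is X pegging the CPU?" test. It assumes a Linux
/proc filesystem; the function names, the 2-second sampling interval and
the 95% threshold are made up for illustration and are not taken from any
existing tool.

```python
import os
import time


def cpu_ticks(stat_line):
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may contain spaces,
    # so split on the last ')' instead of on whitespace alone.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(rest[11]) + int(rest[12])


def cpu_percent(ticks_before, ticks_after, interval_s, ticks_per_s):
    """Share of one CPU used between two samples, as a percentage."""
    return 100.0 * (ticks_after - ticks_before) / (ticks_per_s * interval_s)


def looks_spinning(pid, interval_s=2.0, threshold=95.0):
    """Sample a live process twice; True if it is close to pegging a CPU.

    interval_s and threshold are arbitrary illustrative defaults.
    """
    hz = os.sysconf("SC_CLK_TCK")  # clock ticks per second
    path = f"/proc/{pid}/stat"
    with open(path) as f:
        before = cpu_ticks(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        after = cpu_ticks(f.read())
    return cpu_percent(before, after, interval_s, hz) >= threshold
```

If looks_spinning() returns True for the X server's pid while the box
still answers on ssh, that points to the busy loop described above and not
a kernel lockup, and attaching with "strace -p <pid>" should show what X
is doing.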
On Thu, 9 Oct 2003, Mike A. Harris wrote:
> On Wed, 1 Oct 2003, Pekka Savola wrote:
>
> >A huge number of (probably DRI-related) xscreensaver lockups
> >(especially with screensavers with a lot of 3D effects) have
> >been reported in the past; check a list e.g. from:
> >
> >https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=84214
> >
> >(I tried to collect a few I found there).
> >
> >1) have folks tested whether these still persist in Fedora Core as well?
> >
> >2) is there any hope of getting these bugs fixed?
> > b) if one wants a workaround, just simply removing "too flashy"
> >screensavers might help .. but then the bugs would not get fixed ;-)
>
> More likely than not, all reported issues still exist. The easy
> workaround is to either disable DRI, or to disable the
> screensavers that trigger the lockup. This allows one to use
> their computer for day to day work, etc. without experiencing the
> lockup issues.
>
> In some cases, simply disabling the screensavers triggering the
> issue will allow the user to use other 3D applications without
> problem, while in other cases the problem is deeper and affects all
> 3D applications including non-screensavers.
>
> Debugging 3D acceleration related problems in the DRI is one of
> the more complex and time consuming of all problems that get
> reported against XFree86. Problems can take anywhere from a day
> to a week or two if not more in order to diagnose. Since it is
> often easy to work around the problem by disabling a screensaver
> or whatnot, such problems aren't of the highest priority, in
> particular when most problems with DRI are reported against
> screensavers and/or video games.
>
> Currently, there are only 2 developers here at Red Hat working on
> XFree86 - myself, and John Dennis. John's work has mostly
> focused on handling XFree86 related problems on ia64 and other
> architectures for our Red Hat Enterprise Linux development so
> far, however we hope to work more closely in the future with
> development/troubleshooting/debugging/etc.
>
> With only 2 developers, and the plethora of incoming bug reports
> that do get reported, it's not possible for every single bug
> report to get immediate attention and guaranteed fixes due to the
> need to prioritize work. Also, XFree86 isn't the only work we're
> responsible for, so we need to balance the various duties we're
> responsible for in order to accomplish the most work in the
> least time, and to fix the greatest number of problems
> that affect the greatest number of people.
>
> When it comes to bugs in the DRI and the kernel portion of it
> (the DRM), spending an unknown amount of time debugging an issue
> which may require up to a week or 3 and may end up still being
> unable to determine the cause of a given problem - just to fix a
> hangup caused by a screensaver that can be easily disabled, or to
> a bug triggered by Quake 3, Unreal Tournament 2003 or somesuch,
> just isn't the best usage of Red Hat engineering resources. As
> such, these kinds of problems tend to be much lower priority.
>
> Also, for the video game issues at least, I often do not have the
> video game which triggers the problem in question, so am unable to
> investigate the issue without forking out $70 or more out of my
> own pocket, although I do have RTCW, and a few others...
>
> Often, the problems are bugs in the kernel DRM code, and
> sometimes get fixed magically when the DRM gets updated and a new
> kernel pushed out in an update. Other times the problems are
> motherboard or video card specific, and we simply don't have the
> hardware. Or it could be the specific combination of video card
> and motherboard chipset, operating with a specific AGP speed.
> The number of variables one can chase around is potentially
> staggering, and a lot of time can be spent.
>
> In short, DRI/DRM related bugs found in X should always be
> reported directly to the DRI project, as there are 10+ developers
> who more or less do nothing but develop DRI all day long, and
> troubleshoot these types of issues. It's much more likely to be
> fixed by them, and patches that come out of it possibly included
> in our future kernel and/or XFree86 updates, than it is likely to
> get 2 or 3 weeks of my or John's very limited time. However,
> even if someone does report this upstream to the DRI project
> (preferably by way of http://bugs.xfree86.org), it's always a
> good idea to keep reporting the bugs in Red Hat bugzilla also, in
> order for me to be aware of the problem, and be able to track it
> upstream, or to have some fun poking at video games on the
> weekend to try and fix it in my own personal time.
>
> I know these types of responses are not the kind that users
> expect or want to hear from developers, however I like to be
> straightforward and honest about these types of things.
> Developer resources are a very limited thing, and we need to
> spend our Red Hat engineering time on things that give the
> biggest bang and affect the most users out there, and bugs
> affecting business usage generally get preference over "toy" bugs
> such as video games and screensavers (although what I do in my
> spare time on weekends can definitely be quite the opposite
> <grin>).
>
> Also, the more a person is willing to crack out a debugger, or to
> sprinkle printk()'s in the DRM source code, and hop on #dri-devel
> on irc.freenode.net and stay in the channel for a week or more
> along with developers, the more we foolish developers can be
> sucked into devoting some of our personal time to such problems
> too. Not that it's easy or anything, but some determined folk
> have shown up, and hounded us until they were able to fix a
> problem themselves, with no prior XFree86 or kernel programming
> knowledge, so it is possible. ;o)
>
> I can be found usually on #dri-devel, #xfree86, #xwin,
> #freedesktop and other channels on irc.freenode.net, however I'm
> not always there when I'm there... If someone comes hunting me
> down, you're best off staying in the channel and idling until
> I or others can help with debugging/etc.
>
> Hope this helps explain things, and at the same time give people
> some perspective, and some insight into the best way to go about
> seeking solutions to these types of problems in a way that can
> yield faster fixes and whatnot.
>
> TTYL
>
>