kmod-nvidia not working in kernel-2.6.17-1.2139_FC5

Scott R. Godin scott.g at mhg2.com
Wed Jul 5 16:13:27 UTC 2006


On Wed, 2006-07-05 at 08:16 -0700, Lonni J Friedman wrote:
> On 7/5/06, Scott R. Godin <scott.g at mhg2.com> wrote:
> > Nowhere in this discussion have I seen "nvidia attempting to help" -- in
> > point of fact, in the experience of others, which I have witnessed
> > second and third hand, user bug reports get largely ignored in the
> > general scheme of things. It's only the large corporate customers whose
> > bugs get fixed in anything resembling a timely fashion. I'd love to see
> > that change, but, that'll happen right around the time it goes
> > open-source. :-P
> 
> Surely you're joking.  I've submitted many bugs to Redhat's bugzilla
> for software that ships in their releases (both RHEL & FC), and very
> few of them ever see any attention.  Redhat bows before their large
> corporate customers just like NVIDIA.  You're deceiving yourself if
> you think otherwise.

You'll note most carefully that the only attention my bug submission to
livna's bugzilla received was a statement that they don't follow up with
vendors of proprietary packages. The same would apply if I filed it with
Red Hat, most particularly since *it is not their package*.

However, security-related bugs and urgent showstopping bugs in packages
they DO ship with the product do receive attention, including that of
the package's upstream developers. (If you read any of the changelogs in
stuff downloaded from yum update, you'll notice the word 'upstream'
appearing quite often.)

When I file a bug report with bugzilla I don't expect an immediate
response. When they get to it is fine with me, as long as they do get to
it, which has been the case with every other one I've ever filed.

lftp download.fedora.redhat.com:/pub/fedora/linux/core/updates/5/SRPMS> ls |wc -l
381

As you can see, they aren't twiddling their thumbs either, considering
they have more than one release to backport bugfixes for.

> > It's only the REST of the system that suffered because of it.
> >   o Horribly corrupted rpm databases? Huh?
> >   o Trashed swap partition (how the HELL did that happen?)
> >   o Perms on /etc/rc.d/ and /etc/rc.d/init.d/ suddenly being 0644
> > instead of 0755 (explain that one, if you can) ..
> >   o memory errors on DRAM that's passed memtest86 running all night (at
> > least 10 full test-suites, if not more), 100% cleanly?
> 
> What proof do you have that the nvidia X driver caused this?  Thus far
> your only response was pointing to Mike Harris' personal FUD campaign
> against binary drivers.

It seems to have escaped your notice that *I* did not post the link to
Mike's discussion (I fail to see how you managed to read a 'personal FUD
campaign' into what seems to me to be a fairly reasoned and rational
discussion post).

But anyway, the answer to your other question is simply this:

I continually had problems through no less than FIVE COMPLETE
RE-INSTALLS of Fedora Core 5 UNTIL I stopped including kmod_nvidia as
part of that process about a month ago. I did that SOLELY after speaking
with a few people on irc.freenode.net (one of whom happened to be Mike,
along with some other Red Hat-employed folks I've known for a few years)
about the kernel errors and what I needed to do, via fsck, to fix the
drive problems.

Those problems surfaced when rpm -Va died suddenly while I was
diagnosing something completely different, evidently caused by the
memory corruption I was experiencing (the file perms thing among others,
shown above). It generated the results shown in the pastebin linked
below; look for 'rpmv'.

I distinctly recall being asked if I had any non-Red Hat packages
installed. As it turned out I have two: my wireless ethernet drivers,
and kmod_nvidia.
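
For anyone who wants to check their own box for the perms symptom I
described, here is a rough sketch. It assumes GNU stat from coreutils,
and check_mode is just a name I made up for illustration, not anything
that ships with Fedora. It only compares a directory's mode against what
you expect; it says nothing about what changed it.

```shell
#!/bin/sh
# check_mode: hypothetical helper, not part of any package.
# Assumes GNU stat (coreutils), which supports -c '%a' for the octal mode.
check_mode() {
    # $1 = path to inspect, $2 = expected octal mode (e.g. 755)
    actual=$(stat -c '%a' "$1") || return 2
    if [ "$actual" != "$2" ]; then
        echo "$1: mode is $actual, expected $2"
        return 1
    fi
    return 0
}

# The two directories mentioned in this thread; skipped if absent.
status=0
for d in /etc/rc.d /etc/rc.d/init.d; do
    if [ -d "$d" ]; then
        check_mode "$d" 755 || status=1
    fi
done
```

After the loop, status is 1 if either directory had an unexpected mode.
rpm -Va does a far more thorough job across every installed package (an
'M' in its output marks a mode change), but it takes a good while to
run.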

After the arduous triple manual fsck I removed kmod_nvidia and
re-installed some of the rpms connected to the damaged files on / with
the same packages as before.

The ONLY thing different about the current system from the five previous
installs is the absence of kmod_nvidia.

Draw your own conclusions. I used a big fat black sharpie to draw mine.
(on the back of 27 8x10 color glossy photos...)[1]

> > All sorts of 'general weirdness' that crept in gradually over _months_,
> > each re-install eventually resulting in different problems (some of
> > which I've casually grouped together above, but none of these occurred
> > during the same install. Each install was the eventual result of one of
> > the above (plus a few others) after I noticed it and did my careful best
> > to correct and preserve a system that I use on a daily basis.)
> >
> > If you saw this:
> >
> >     http://phpfi.com/125284
> >
> > Would the video drivers be the first place (or the second? the fourth?
> > top ten?) you looked for the culprit? No? funny thing, *neither would
> > I*... nevertheless nvidia was indeed the source of this and other
> > strange problems. And that's the *only* time out of the five that I got
> > anything conclusive recorded as far as error messages that indicated
> > *something* was horribly wrong (_before_ things *went* completely
> > wahooni-shaped forcing a reinstall), and led me to a real, practical,
> > restored to full functionality, solution.
> 
> Again, what proof do you have that the nvidia X driver caused that
> crash?  As you, yourself, noted, the nvidia kernel module is no where
> to be found in that backtrace.

Yes, I know. Strange, is it not? memtest86, run _overnight_, showed
nothing wrong with the memory sticks. Yet uninstalling kmod_nvidia has
proven to be the solution, solely on the basis of the fact that my
system is no longer gradually going wahooni-shaped. Again, you are free
to draw your own
conclusions. I am doing nothing different with this system today than I
was doing months ago except more of it. (more local-only vhosts for
preflight website testing, more files in /home, more yum updates
downloaded, etc.)

> > So please, spare me. If it works for you, great. I'm happy for you.
> > _Proceed with caution_. That road, however well-traveled, is not
> > well-paved.
> 
> At this point, you sound like a troll with an axe to grind.

At this point I sound *exactly* like a user thoroughly frustrated with
having to re-install an otherwise utterly stable system _multiple times
over the past year_ (which costs me money in downtime as I cannot work
while the system's fubared) due to untraceable mysterious failures in
the system -- NONE of which showed any connection to the errors occurring
during the NEXT complete wipe-and-install (and the next. and the next.
and the next. shall I go on?), who is overjoyed at having found out what
the problem REALLY was, and mad-as-hell-not-going-to-take-it-anymore[2]
with Nvidia for making it difficult and unrewarding *for anyone
including the package maintainers* (unless you're waving truckloads of
money at them) to talk to them about bugs in their driver.

Coincidence? Highly doubtful.

This tale of woe also happens to include different versions of
kmod_nvidia as Nvidia released a new version of their driver somewhere
in between when I first upgraded from FC4 to FC5 and the present. 

This is a cautionary tale, nothing more. Do whatever you want, but you
cannot say I didn't try to warn you. I sincerely hope it works better
for others than it did for me, but the more my uptime continues to
improve, the happier I am and the more convinced I become that I made
the right choice.

The ONLY way you'll convince me that it wasn't kmod_nvidia is if it
happens again, and I find that _even more_ unlikely than you seeing any
connection between that pastebin and kmod_nvidia. :D

*** 
[1] Cheers to anyone who groks the reference.
[2] Or this one. Great movie.