
Re: F10+dmraid eats puppies! (and ate my system too)



Graham TerMarsch wrote:
I ran into this earlier in the week and, after finally getting my machine back online, am surprised to see that people aren't making a big stink about this... it's got subtle nuances that make it nearly impossible to fix without loss of data.

I've found the following threads/bugs that appear related:

 https://bugzilla.redhat.com/show_bug.cgi?id=474697
 http://forums.fedoraforum.org/showthread.php?t=206206
 http://forums.fedoraforum.org/showthread.php?t=206284

Here's what happened to me...

I upgraded from F9 to F10 back on Nov 29th, and things seemed fine. I upgraded the kernel last Wednesday, rebooted, and started seeing all sorts of crazy weirdness. At first the system wouldn't boot at all, dying on errors of "killing init" and "corrupted libraries". I thought it sounded like FS corruption, so I booted the rescue CD, ran fsck (which came back clean), and then proceeded to re-install some of the packages with the corrupted libraries, so I could at least get the machine up and running again.

After several cycles of "rescue CD, install packages, reboot, fail", I decided that even if I could get it running I wasn't going to trust it. Went back to the rescue CD, and started backing up files onto other machines on the network here.

I then re-installed the machine, leaving my "/home" and "/usr/local" partitions as they were; reformatted everything else, but left those alone. Got the system up, but was then presented with the most shocking thing... it looked like my machine had basically done time-travel and was now *exactly* as it was on November 29th. Files I know I'd edited were missing changes, e-mails were lost, databases were missing data.

Took me a while to figure it out, but here's what happened...

When I upgraded from F9 to F10, Anaconda detected my nvidia dmraid mirror and installed F10 onto both halves of the mirror. When I rebooted, though, it only picked up *ONE HALF* of the mirror... /dev/sda. It had the UUIDs right, but instead of mounting /dev/mapper/nvidia_xxxx it mounted sda. When I did the kernel upgrade this week, *that* mounted sdb. When I reinstalled, it *also* mounted sdb, not sda or dmraid.

When I looked at sda directly, I saw all of my recent changes to files that I'd made since the 29th. When I looked at sdb directly, it was a snapshot of what my machine looked like on the 29th.
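
For anyone wanting to check whether they're in the same boat, here is a rough sketch (mine, not something from the bug reports) that reads mount-table text in the /proc/mounts format and flags mount points backed by a bare /dev/sdX device rather than a device-mapper node. The function names and the sample table are made up for illustration, and the "raw disk" test is just a naming heuristic. On some systems the root entry shows up as /dev/root, in which case this check tells you nothing about "/".

```python
import re

def mount_source(mounts_text, mount_point="/"):
    """Return the device field of the last entry mounted at mount_point, or None."""
    source = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1] == mount_point:
            source = fields[0]  # a later entry shadows an earlier one
    return source

def looks_like_raw_disk(device):
    """True for names like /dev/sda or /dev/sdb3 (heuristic based on the name only)."""
    return device is not None and re.fullmatch(r"/dev/sd[a-z]+\d*", device) is not None

# Hypothetical mount table, for illustration only. On a real machine,
# read the text from /proc/mounts instead.
SAMPLE = """\
/dev/sda1 / ext3 rw 0 0
/dev/mapper/nvidia_xxxx /home ext3 rw 0 0
"""

for mp in ("/", "/home"):
    dev = mount_source(SAMPLE, mp)
    verdict = "raw disk, NOT the mirror" if looks_like_raw_disk(dev) else "not a bare sdX device"
    print(mp, dev, verdict)
```

If a filesystem that is supposed to live on the mirror reports a bare sdX device, you are probably writing to only one half.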

When we actually manage to get the bug fixed that caused this, anyone who's had this problem is potentially going to be in for a bigger world of hurt when applying the fix... I don't even think we can (with confidence) just nuke one half of the mirror and rebuild based on what's on the other half; how do we know which half they've been using? In my case, I'd made ~2 weeks of changes to sda not knowing that I was only using half the mirror, and then after updating the kernel got bumped over to sdb and made changes there while trying to fix it. Neither one was a mirror of the other, and each one had something on it that needed to be preserved. YUCK.
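
To get a feel for how far the two halves have drifted before deciding what to keep, something like the following might help. It is only a sketch and assumes each half has been mounted read-only at its own mount point; the function names and report keys are mine. It compares size and modification time only, not file contents, so treat the result as a list of things to look at rather than proof of anything.

```python
import os

def snapshot(root):
    """Map each regular file's path (relative to root) to (size, mtime)."""
    files = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full) or not os.path.isfile(full):
                continue  # skip symlinks, sockets, device nodes, etc.
            st = os.stat(full)
            files[os.path.relpath(full, root)] = (st.st_size, st.st_mtime)
    return files

def compare_halves(root_a, root_b):
    """Classify files by which tree has them, and which copy is newer."""
    a, b = snapshot(root_a), snapshot(root_b)
    report = {"only_a": [], "only_b": [], "newer_a": [], "newer_b": [], "size_differs": []}
    for path in sorted(set(a) | set(b)):
        if path not in b:
            report["only_a"].append(path)
        elif path not in a:
            report["only_b"].append(path)
        elif a[path] != b[path]:
            if a[path][1] > b[path][1]:
                report["newer_a"].append(path)
            elif a[path][1] < b[path][1]:
                report["newer_b"].append(path)
            else:
                report["size_differs"].append(path)  # same mtime, different size
    return report
```

With the halves mounted at, say, /mnt/half_a and /mnt/half_b (hypothetical paths), compare_halves("/mnt/half_a", "/mnt/half_b") returns lists of relative paths. If both "newer_a" and "newer_b" come back non-empty, neither half can simply be thrown away.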

Once I realized what'd happened to my machine, I went into the BIOS, turned off the nvidia fakeraid, and re-installed directly onto the two drives. This isn't what I want, as I'd at least like to have _some_ mirror of my data somewhere, but it was the only way I could get this machine running again.

Be forewarned.... F10+dmraid is *DANGEROUS* right now...

My perception is that using mdadm is a more reliable technology at the moment.

--
Bill Davidsen <davidsen tmr com>
  "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot

