Re: [dm-devel] DM-RAID1 data corruption

Mikulas Patocka [mpatocka redhat com] wrote:
> because of a loose cable, overheating, insufficient power or so, and the 
> condition is repaired), raid1 sees set bit in the dirty bitmap and starts 
> copying data from disk 0 to disk 1.
> The result: write bio was ended as success, but the data was lost. For 
> databases, this might have bad consequences - committed transactions being 
> forgotten.
> If the above scenario can't happen, pls. describe why.
IIRC, this is a known problem, always attributed to a "rare/small
window" of chance. :-(

> Delay all bios until the userspace code removes the failed mirror?

That is what the code does when a log device fails. We can use the same

> Or store the number of the default mirror in the log?

This is one way to do it, but what about "corelog" mirrors, which keep
the log only in memory?

Look at this patch

It essentially generates a uevent and waits for the user-level code to
act on it and send a message to unblock it.
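
To make the idea concrete, here is a rough userspace toy model of the
block-until-userspace-acts behaviour described above. It is not the
patch and uses no kernel or device-mapper API; every name in it
(toy_mirror, toy_leg_failed, toy_submit_write, toy_unblock) is made up
for illustration, and the "uevent" is just a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_HELD 16

/* Toy model of a mirror set; all names here are hypothetical. */
struct toy_mirror {
    bool   blocked;          /* true after a leg failure, until userspace acts */
    int    events_raised;    /* stands in for the uevent sent to userspace */
    int    held[MAX_HELD];   /* ids of writes delayed while blocked */
    size_t nr_held;
    int    completed;        /* writes reported as successful to the caller */
};

/* A mirror leg failed: stop completing writes and notify userspace once. */
static void toy_leg_failed(struct toy_mirror *m)
{
    if (!m->blocked) {
        m->blocked = true;
        m->events_raised++;
    }
}

/* Submit a write. Returns true if it completed, false if it was held. */
static bool toy_submit_write(struct toy_mirror *m, int id)
{
    if (m->blocked) {
        assert(m->nr_held < MAX_HELD);
        m->held[m->nr_held++] = id;
        return false;
    }
    m->completed++;
    return true;
}

/*
 * Userspace has removed the failed leg and sent the "unblock" message:
 * release every held write and report how many were released.
 */
static size_t toy_unblock(struct toy_mirror *m)
{
    size_t released = m->nr_held;

    m->blocked = false;
    m->completed += (int)released;
    m->nr_held = 0;
    return released;
}
```

The point of the model is only that no write is reported as successful
between the failure and the unblock message, so a later resync cannot
silently discard a write that the caller already saw complete.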

