Re: RAID 1 Mismatches

On Mon, 28 Dec 2009 19:41:53 -0800
Rick Wagner <rjwgnr27 verizon net> wrote:

> I have three disks in my system, divided into a mix of RAID-1 and
> RAID-5 partitions.  I have /boot as a RAID-1 on MD0 (SDA1, SDB1), /
> as RAID-1 on MD1 (SDA2, SDB2), and the remainder as RAID-5 with LVM
> (SDA3, SDB3, SDC3).  The intent is that for boot and root, if SDA
> fails, I can boot off SDB.
> Every week I get an "Anacron job 'cron.weekly'" e-mail telling of:
> "WARNING: mismatch_cnt is not 0 on /dev/md1".  If I look at 
> /sys/block/md1/md/mismatch_cnt, I will see a number, usually 128,
> sometimes 64.  If I manually invoke a scan, I will see the same
> number.  Note that neither MD0 nor MD3 report any errors.
> If I set SDB1 bad, then remove then re-add it, it will rebuild fine,
> and a scan shows no errors.  Scanning the messages (current and
> historical), I see no reports of medium errors reported for any of
> SD[ABC].  
> I have several questions:
> 1) Where can the errors be coming from?  I would understand if a
> drive were reporting errors.  Could it be during boot, one of the R-1
> members is being written to before MD is started?

This warning means that some blocks differ between the two halves of
the mirror, in space the filesystem is not using.

> 2) Sans drive errors messages, how to determine which drive is out of
> sync.  I have resynced SDA2 to SDB2, but that is basically a coin
> flip; if SDB2 were correct, then I may have damaged files on SDA2.
> How can I determine where the mismatches occur, and then determine
> the file(s) potentially affected?

No need to. Just ask md to run a repair on the array (replace <#> with
the array number, e.g. 1 for md1):

echo repair >/sys/block/md<#>/md/sync_action

Then, once the repair has finished, run another check:

echo check >/sys/block/md<#>/md/sync_action
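
When the check has finished, the result can be read back from
mismatch_cnt, the same file mentioned in the original question. Below is
a minimal sketch that prints the count for every md array it finds. The
function name md_mismatch_report and the MD_SYSFS override are made up
for this example (they are not part of mdadm or the kernel); the
override only exists so the function can be tried against a scratch
directory instead of the real /sys/block.

```shell
#!/bin/sh
# Hypothetical helper: print "<array> <mismatch_cnt>" for each md array.
# MD_SYSFS is an assumed override for the sysfs root; default /sys/block.
md_mismatch_report() {
    root="${MD_SYSFS:-/sys/block}"
    for f in "$root"/md*/md/mismatch_cnt; do
        # Skip the unexpanded glob when no md arrays are present.
        [ -r "$f" ] || continue
        # Strip the root prefix, then keep only the array name (e.g. md1).
        dev=${f#"$root"/}
        dev=${dev%%/*}
        printf '%s %s\n' "$dev" "$(cat "$f")"
    done
}
```

Calling md_mismatch_report after the check completes should show 0 for
every array if the repair did its job; sync_action reads "idle" once a
check or repair is no longer running.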


