[dm-devel] [PATCH 7/7] Hold all write bios when errors are handled

Tue Nov 24 19:17:04 UTC 2009

Mikulas Patocka [mpatocka at redhat.com] wrote:
> Yes, writes after the failed request are processed, but it is not a 
> problem --- if the write succeeded on all legs, it is returned as success 
> --- in this case, resynchronization can't corrupt written data. If the 
> write succeeded only on some legs, it is held again.
> 
> So in practice, if some leg fails completely, all writes will be held.
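
As a rough illustration of the completion rule quoted above (a sketch only, not the dm-raid1 code): the names `Outcome` and `classify_write` are hypothetical, and the "failed on every leg" branch is an assumption, since the thread does not discuss it.

```python
from enum import Enum


class Outcome(Enum):
    COMPLETE = "complete"  # reported to the application as success
    HOLD = "hold"          # kept back until the error has been handled
    ERROR = "error"        # assumption: this case is not covered in the thread


def classify_write(leg_ok):
    """Classify one write from its per-leg results (True = leg wrote OK)."""
    if not leg_ok:
        raise ValueError("need at least one mirror leg")
    if all(leg_ok):
        # Succeeded on all legs: resynchronization cannot corrupt this data.
        return Outcome.COMPLETE
    if any(leg_ok):
        # Succeeded on only some legs: hold the bio.
        return Outcome.HOLD
    return Outcome.ERROR


print(classify_write([True, True]))
print(classify_write([True, False]))
```

With one leg failed completely, every write lands in the second branch, which matches the statement that all writes end up held.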

I need to look at the code again, but I thought any new writes to a
failed region go to a surviving leg. In that case, we end up returning
I/Os to the application after writing to a single leg.

> > Also, we do need to do the above work only if "primary" leg fails. We
> > can continue to work just like the old code if "secondary" legs fail,
> > right? Not sure if this is worth optimizing though, but I would like to
> > see it implemented as it is just a few extra checks. We can have
> > primary_failure field like log_failure field.

> I thought about it too, but concluded that we need to hold bios even if 
> the secondary leg fails.
> 
> Imagine this scenario:
> * secondary leg fails
> * write fails on the secondary leg and succeeds on the primary leg 
> and is successfully completed
> * the computer crashes
> * after a reboot, the primary leg is inaccessible and the secondary leg is 
> back online --- now raid1 would be returning stale data.

The software can detect this case. We can either fail it completely or,
with help from the admin, use the data from the secondary, which could
be "stale". Let us call this method 1.

> If we hold the bios when the secondary leg fails (as the patch does), one of 
> these two scenarios happens:
> 
> * secondary leg fails
> * write succeeds on the primary leg and is held
> * the computer crashes
> * after a reboot, the primary leg is inaccessible and the secondary leg is
> back online --- but we haven't completed the write, so the transaction 
> wasn't reported as committed
> 
> or
> 
> * secondary leg fails
> * write succeeds on the primary leg and is held
> * dmeventd removes the secondary leg and the write succeeds
> * the computer crashes
> * after a reboot, the primary leg is inaccessible, the secondary leg was 
> already removed by dmeventd, so the array is considered inaccessible. So 
> it doesn't work but at least it doesn't revert an already committed 
> transaction.
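
The three quoted scenarios can be compared with a toy model. This is only a sketch under simplifying assumptions: a single write, data tracked as a version number, and hypothetical names (`after_reboot`, the policy strings) that do not come from the patch.

```python
def after_reboot(policy, dmeventd_ran):
    """Return what the mirror looks like after the crash and reboot.

    policy: "complete_immediately" (acknowledge once the primary has the
            data) or "hold" (acknowledge only after error handling).
    dmeventd_ran: whether dmeventd removed the failed secondary leg
            before the crash.
    """
    if policy not in ("complete_immediately", "hold"):
        raise ValueError("unknown policy: %r" % policy)

    primary = {"version": 0, "in_array": True}
    secondary = {"version": 0, "in_array": True}
    acked = 0  # newest version reported to the application as committed

    # The secondary leg fails; the write (version 1) reaches only the primary.
    primary["version"] = 1

    if policy == "complete_immediately":
        acked = 1
    elif dmeventd_ran:
        # The held write is released once the secondary has been removed.
        secondary["in_array"] = False
        acked = 1

    # Crash. After reboot the primary is inaccessible and the secondary
    # device is back online, so only the secondary can be read, and only
    # if it is still part of the array.
    if not secondary["in_array"]:
        return "inaccessible"
    return "stale" if secondary["version"] < acked else "consistent"


for policy, ran in [("complete_immediately", False),
                    ("hold", False),
                    ("hold", True)]:
    print(policy, ran, after_reboot(policy, ran))
```

In this model only the first policy can return data older than what the application was told is committed; the two "hold" cases return either consistent data or no array at all.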

How is this latter case (it doesn't need a crash anyway)
different from, or better than, the case where we detect that the
'primary' is missing and ask the admin whether he wants to use the data
on the secondary? At least the admin has a choice with "method 1"; this
approach gives him none.

Thanks, Malahal.