[dm-devel] [PATCH 7/7] Hold all write bios when errors are handled
Mikulas Patocka
mpatocka at redhat.com
Wed Nov 25 13:19:19 UTC 2009
On Tue, 24 Nov 2009, malahal at us.ibm.com wrote:
> I need to look at the code again, but I thought any new writes to a
> failed region go to a surviving leg. In that case, we end up returning
> I/O's to the application after writing to a single leg.
Writes always go to all the legs, see do_write(). Anyway, dmeventd removes
the failed leg soon.
> > > Also, we do need to do the above work only if "primary" leg fails. We
> > > can continue to work just like the old code if "secondary" legs fail,
> > > right? Not sure if this is worth optimizing though, but I would like to
> > > see it implemented as it is just a few extra checks. We can have
> > > primary_failure field like log_failure field.
>
> > I thought about it too, but concluded that we need to hold bios even if
> > the secondary leg fails.
> >
> > Imagine this scenario:
> > * secondary leg fails
> > * write fails on the secondary leg and succeeds on the primary leg
> > and is reported as successfully completed
> > * the computer crashes
> > * after a reboot, the primary leg is inaccessible and the secondary leg is
> > back online --- now raid1 would be returning stale data.
>
> The software can detect this case. We can fail this completely or use
> the data from the secondary that could be "stale" with help from admin.
> Let us call this method 1.
You can't detect it because the computer crashed *before* the information
that the secondary leg failed was written to the metadata.
So, after a reboot, you can't tell whether any mirror leg failed some
requests before the crash.
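
To make the failure window concrete, here is a minimal toy model of the
no-hold case. It is not dm-raid1 code: the Leg and Mirror classes, the
failed_in_metadata set and both helper functions are hypothetical names
invented for this sketch, and a single string stands in for the block being
written.

```python
from dataclasses import dataclass, field


@dataclass
class Leg:
    name: str
    data: str = "old"      # contents of the single block we model
    online: bool = True


@dataclass
class Mirror:
    primary: Leg
    secondary: Leg
    # Stands in for on-disk metadata: legs recorded as failed.
    # Only what is in this set survives a crash.
    failed_in_metadata: set = field(default_factory=set)


def write_without_hold(mirror, value):
    """Write to every online leg; acknowledge if at least one leg took it."""
    ok = False
    for leg in (mirror.primary, mirror.secondary):
        if leg.online:
            leg.data = value
            ok = True
    return ok  # acknowledged before any failure is recorded


def read_after_reboot(mirror):
    """Read from the first online leg that metadata does not mark as failed."""
    for leg in (mirror.primary, mirror.secondary):
        if leg.online and leg.name not in mirror.failed_in_metadata:
            return leg.data
    return None


m = Mirror(Leg("primary"), Leg("secondary"))
m.secondary.online = False              # secondary leg fails
acked = write_without_hold(m, "new")    # lands on the primary only
# Crash happens here: failed_in_metadata was never updated.
m.primary.online = False                # after reboot the primary is gone
m.secondary.online = True               # and the secondary is back
result = read_after_reboot(m)           # stale block, nothing flags it
```

In this model the write was acknowledged, yet the read after reboot returns
the old contents and the metadata gives no hint that the secondary leg ever
missed a write.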
> > If we hold the bios when the secondary leg fails (as the patch does), one
> > of these two scenarios happens:
> >
> > * secondary leg fails
> > * write succeeds on the primary leg and is held
> > * the computer crashes
> > * after a reboot, the primary leg is inaccessible and the secondary leg is
> > back online --- but we haven't completed the write, so the transaction
> > wasn't reported as committed
> >
> > or
> >
> > * secondary leg fails
> > * write succeeds on the primary leg and is held
> > * dmeventd removes the secondary leg and the write succeeds
> > * the computer crashes
> > * after a reboot, the primary leg is inaccessible, the secondary leg was
> > already removed by dmeventd, so the array is considered inaccessible. So
> > it doesn't work but at least it doesn't revert already committed
> > transaction.
>
> How is this latter case (it doesn't need a crash anyway)
> different/better from the case where we detect that 'primary' is missing
> and ask the admin whether he wants to use the data on the secondary or
> not? At least the admin has a choice with "method 1"; this approach
> doesn't offer that choice.
If you always ask the admin when the primary leg fails and wait for his
action, you lose fault-tolerance --- the computer would stall until the
admin responds.
The requirements are:
* if one of the legs fails or the log fails, you must automatically continue
without human intervention
* if both legs fail, you must shut it down and not pretend that something
was written when it wasn't (this would break the durability requirement of
transactions).
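
The two requirements, together with the hold behaviour discussed earlier in
the thread, can be sketched as a small decision function. This is only an
illustration under stated assumptions, not the patch itself: the Action enum,
the decide() function and its failed_legs_removed flag are hypothetical names,
with the flag standing for "the failure has been recorded, e.g. dmeventd has
removed the failed leg".

```python
from enum import Enum


class Action(Enum):
    COMPLETE = "complete"   # report success to the caller
    HOLD = "hold"           # keep the write bio queued, report nothing yet
    FAIL = "fail"           # report an error, never a false success


def decide(legs_ok, failed_legs_removed):
    """Pick what to do with a write once every leg has answered.

    legs_ok: one bool per mirror leg, True if the write succeeded there.
    failed_legs_removed: True once the failed legs have been dropped from
        the mirror, so a reboot cannot bring stale legs back.
    """
    if not any(legs_ok):
        # Both legs failed: nothing was written, so do not pretend it was.
        return Action.FAIL
    if all(legs_ok):
        return Action.COMPLETE
    # Partial failure: continue automatically, but only acknowledge once
    # the failure is recorded.
    return Action.COMPLETE if failed_legs_removed else Action.HOLD
```

For example, decide([True, False], False) holds the write, and the same call
with failed_legs_removed=True completes it, with no admin involved in either
step.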
Mikulas
> Thanks, Malahal.