Re: [dm-devel] DM-RAID1 data corruption

malahal us ibm com wrote:
> Takahiro Yasui [tyasui redhat com] wrote:
>> malahal us ibm com wrote:
>>> Look at this patch
>>> http://permalink.gmane.org/gmane.linux.kernel.device-mapper.devel/4973
>>> It essentially generates a uevent and waits for the user level code to
>>> act on it and send a message to unblock it.
>> This patch was posted more than a year ago, and I could not find
>> any discussion on this issue/patch in the mailing list archive.
>> What was the conclusion of the discussion about this patch?
>> Are there any discussions outside this mailing list?
> The patch alone can't fix the issue. It needed LVM changes. We had some
> discussions on how to implement the LVM related changes. Finally I was
> told to look at the remote-replication target code to see how that handles
> selecting the right "MASTER" device. That code is not published yet.

Who is working on this?

> That is how the "log device" failure is handled today. Alasdair also
> thought we needed to change LVM to handle events as soon as possible
> using a single thread and not block behind an LVM scan, etc.

I agree. I also described this point in the background section of
"Introduce metadata cache".

> Another method is to have dm-mirror target metadata on the disk itself.
> This metadata is internal to the kernel module and LVM would NOT touch it.
> This would avoid any user level interaction and delays.

I'm interested in this approach, in which dm-mirror manages its own
metadata to track status such as the number of the default mirror and
the valid legs. When an error is detected, dm-mirror handles it and
disables the failed disk as soon as possible in kernel space; the
LVM metadata is then updated in user space later.
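
To make the idea concrete, here is a minimal user-space model in Python.
It is only a sketch of the split between a fast in-kernel path and a
deferred user-space path, not kernel code; the names MirrorState,
handle_io_error and drain_events are hypothetical and do not exist in
dm-mirror or LVM.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class MirrorState:
    """Hypothetical status record that the mirror keeps for itself."""
    legs: list                      # device names of all mirror legs
    default_leg: int = 0            # index of the preferred (default) leg
    valid: set = field(default_factory=set)               # indices of usable legs
    pending_events: deque = field(default_factory=deque)  # work deferred to user space

    def __post_init__(self):
        # Assume every leg is valid unless the caller says otherwise.
        if not self.valid:
            self.valid = set(range(len(self.legs)))

    def handle_io_error(self, leg):
        """Fast path: disable the failed leg at once, without waiting for user space."""
        self.valid.discard(leg)
        if not self.valid:
            raise RuntimeError("no valid mirror legs left")
        if self.default_leg not in self.valid:
            # Pick any surviving leg as the new default.
            self.default_leg = min(self.valid)
        # Record the failure so user space can update its metadata later.
        self.pending_events.append(("leg_failed", self.legs[leg]))

    def drain_events(self):
        """Slow path: user space collects the recorded events when it gets to them."""
        events = list(self.pending_events)
        self.pending_events.clear()
        return events
```

In this model, calling handle_io_error(0) on a two-leg mirror switches
the default to the surviving leg immediately, and the failure is only
reported when drain_events() is called, which stands in for the later
LVM metadata update.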

Some transaction systems are sensitive to delay, and approaches
which don't cause much delay even if an error was detected are

> Of course, we can do something in the log itself but it will not fix
> "corelog" mirrors; moreover, the system can't auto recover after a
> missing log alone.

Yes, storing information on the log device does not save "corelog"
mirrors, so we might need some area to keep information on mirror
