[dm-devel] Re: [PATCH 3/3] Add timeout feature
- From: jim owens <jowens hp com>
- To: Takashi Sato <t-sato yk jp nec com>
- Cc: axboe kernel dk, Theodore Tso <tytso mit edu>, mtk manpages googlemail com, Miklos Szeredi <miklos szeredi hu>, Dave Chinner <david fromorbit com>, linux-kernel vger kernel org, xfs oss sgi com, hch infradead org, dm-devel redhat com, viro ZenIV linux org uk, linux-fsdevel vger kernel org, akpm linux-foundation org, linux-ext4 vger kernel org, Arjan van de Ven <arjan infradead org>, pavel suse cz
- Subject: [dm-devel] Re: [PATCH 3/3] Add timeout feature
- Date: Mon, 14 Jul 2008 10:04:30 -0400
Takashi Sato wrote:
> What is the difference between the timeout and AUTO-THAW?
> When the kernel detects a deadlock, does it occur to solve it?
TIMEOUT is a user-specified limit for the freeze. It is
not a deadlock preventer or deadlock breaker. The reason
it exists is:
- middle of the night (low but not zero users)
- cron triggers freeze and hardware snapshot
- SAN is overloaded by tape copy traffic so
  hardware will take 2 hours to ack snapshot done
- user "company president" tries to create a report
  needed for an AM meeting with bankers
- with so few users, system will just patiently
  wait for hardware to finish
- after 10 minutes "company president" pages
  admin, admin's boss, and "IT vice president"
  in a real unhappy mood
AUTO-THAW is simply a name for the effect of any deadlock
preventer or deadlock breaker code in the kernel's freeze
implementation paths, whenever that code ends up
unfreezing the filesystem. We also implemented deadlock
preventer code that does not thaw the freeze.
None of the AUTO-THAW code is there to rein in a stupid
userspace caller of freeze. It handles things
like "a system in our cluster is going down so we
must have this filesystem unfrozen or the whole
cluster will crash". In places where there could be
a kernel deadlock we made it "lock-only-if-non-blocking"
and if we could not wait to retry later, the failure
to lock would trigger an immediate unfreeze.
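The "lock-only-if-non-blocking" idea above can be sketched in user-space C. This is only an illustrative model, not kernel code and not the implementation described in this mail: `struct fs_model`, `fs_trylock()`, `fs_thaw()` and `critical_path()` are hypothetical names, and a C11 `atomic_flag` stands in for whatever lock a real critical path would take.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of a frozen filesystem; all names are illustrative. */
struct fs_model {
    atomic_flag lock;   /* stands in for a lock taken on a critical path */
    bool frozen;        /* true while the filesystem is frozen */
    int auto_thaws;     /* how many times a failed trylock forced a thaw */
};

static void fs_init(struct fs_model *fs)
{
    atomic_flag_clear(&fs->lock);
    fs->frozen = false;
    fs->auto_thaws = 0;
}

/* Non-blocking acquire: true if the lock was taken, false if it was held. */
static bool fs_trylock(struct fs_model *fs)
{
    /* test_and_set returns the previous value; false means it was free. */
    return !atomic_flag_test_and_set(&fs->lock);
}

static void fs_unlock(struct fs_model *fs)
{
    atomic_flag_clear(&fs->lock);
}

/* Thaw caused by the deadlock preventer rather than by the user. */
static void fs_thaw(struct fs_model *fs)
{
    if (fs->frozen) {
        fs->frozen = false;
        fs->auto_thaws++;
    }
}

/*
 * A path that must never sleep on the lock. Returns true if the work ran.
 * If the lock is busy and the caller cannot come back later, the frozen
 * filesystem is thawed immediately instead of waiting.
 */
static bool critical_path(struct fs_model *fs, bool can_retry_later)
{
    if (fs_trylock(fs)) {
        /* ... work that needs the lock would go here ... */
        fs_unlock(fs);
        return true;
    }
    if (!can_retry_later)
        fs_thaw(fs);
    return false;
}
```

In this model a caller that can retry leaves the freeze alone, and only a caller that cannot wait turns a failed trylock into a thaw.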
Deadlock prevention needs code in critical paths in more
than just filesystems. Sometimes this is as simple as
an "I can't wait on freeze" flag added to a vm-filesystem
interface.
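As a rough illustration of such a flag, the sketch below models an entry point that normally waits for a thaw but lets a caller opt out. Everything here is an assumption made for the example: `REQ_NOWAIT_ON_FREEZE`, `struct fs_state` and `fs_begin_write()` are hypothetical names, not an existing kernel interface, and the "would sleep" case is reported as a return value because a user-space model has nothing to sleep on.

```c
#include <stdbool.h>

/* Hypothetical request flag: "I can't wait on freeze". */
#define REQ_NOWAIT_ON_FREEZE 0x1u

/* Possible outcomes for a caller entering the filesystem. */
enum fs_entry {
    FS_PROCEED,     /* not frozen, go ahead */
    FS_WOULD_SLEEP, /* frozen; an ordinary caller would wait for thaw */
    FS_REFUSED      /* frozen; caller said it cannot wait, so fail fast */
};

/* Hypothetical filesystem state for the model. */
struct fs_state {
    bool frozen;
};

/*
 * Decide what happens to a write request. A caller such as the VM,
 * which may be trying to free memory, passes REQ_NOWAIT_ON_FREEZE so
 * that it gets an immediate answer and can look for another way out.
 */
static enum fs_entry fs_begin_write(const struct fs_state *fs,
                                    unsigned int flags)
{
    if (!fs->frozen)
        return FS_PROCEED;
    if (flags & REQ_NOWAIT_ON_FREEZE)
        return FS_REFUSED;
    return FS_WOULD_SLEEP;
}
```

The point of the flag in this model is that the decision is made at the interface, so the caller that cannot afford to block never reaches the wait at all.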
Timers just don't work for keeping the kernel alive
because they don't trigger on resource exhaustion.
jim