
Re: [dm-devel] 2.6.2-udm2



On 2004-02-18T13:01:29,
   Heinz Mauelshagen <mauelshagen redhat com> said:

> > - The test interval is no longer passed in with the target
> > parameters (tool writers take note).
> > 
> > - The kernel now does no path testing at all, let userland do it
> > (see thread on lkml).
> 
> I'm not convinced that we can recover from OOM situations with either
> approach (userspace testing or dm testing of failed paths).

I'm thinking that if you have run yourself into such a failure - i.e. all
paths currently down, swap on the failed m-p device, OOM _and_ needing
to allocate memory / swap - the system is in a very, very sick state
anyway. Handling it perfectly may just not be possible.

> But if *all* paths of the multipath target to test are failed *and*
> the system is OOM, the driver that is called to queue the test I/O can
> sleep on allocating memory (either calling [kv]malloc() directly or
> indirectly through mempools).
> 
> That memory allocation is in danger of deadlocking, because the
> pageouts it needs involve the very multipathed target we want to unfail.
> 
> The 'workaround' for this which is reloading the table in order to set
> all paths to operational again would involve memory allocation as well
> :(

Yes. I actually see no way around this, except to tap into a general
'emergency' memory pool. I thought there was something like it, but I
forgot the name ;-) Doesn't pvmove use the same?
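
To make the idea concrete, here is a rough userspace sketch of such a
reserve pool. It is only an illustration of the concept, not the kernel's
mempool code: the names (reserve_pool_*, try_alloc, simulate_oom) are made
up for this example, and the simulate_oom flag is just a stand-in for the
normal allocator failing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define RESERVE_MAX 16	/* arbitrary upper bound for this sketch */

struct reserve_pool {
	void  *slot[RESERVE_MAX];	/* elements currently held in reserve */
	size_t nr_free;			/* number of elements in slot[] */
	size_t min_nr;			/* size the reserve is refilled to */
	size_t elem_size;		/* size of one element in bytes */
};

/* Hypothetical hook: set to 1 to pretend the normal allocator is OOM. */
static int simulate_oom = 0;

static void *try_alloc(size_t size)
{
	return simulate_oom ? NULL : malloc(size);
}

/* Release every element still held in reserve. */
static void reserve_pool_destroy(struct reserve_pool *p)
{
	while (p->nr_free > 0)
		free(p->slot[--p->nr_free]);
}

/* Preallocate min_nr elements up front, while memory is still available. */
static int reserve_pool_init(struct reserve_pool *p, size_t min_nr,
			     size_t elem_size)
{
	if (min_nr > RESERVE_MAX)
		return -1;
	p->nr_free = 0;
	p->min_nr = min_nr;
	p->elem_size = elem_size;
	while (p->nr_free < min_nr) {
		void *e = malloc(elem_size);
		if (!e) {
			reserve_pool_destroy(p);
			return -1;
		}
		p->slot[p->nr_free++] = e;
	}
	return 0;
}

/* Try the normal allocator first; dip into the reserve only if it fails. */
static void *reserve_pool_alloc(struct reserve_pool *p)
{
	void *e = try_alloc(p->elem_size);

	if (e)
		return e;
	if (p->nr_free > 0)
		return p->slot[--p->nr_free];
	/* Reserve exhausted: a blocking pool would have to wait here
	 * until someone frees an element. This sketch just fails. */
	return NULL;
}

/* Refill the reserve first; hand memory back only once it is full. */
static void reserve_pool_free(struct reserve_pool *p, void *e)
{
	if (!e)
		return;
	assert(p->nr_free <= p->min_nr);
	if (p->nr_free < p->min_nr)
		p->slot[p->nr_free++] = e;
	else
		free(e);
}
```

Note that a pool like this only guarantees forward progress if the
elements handed out are eventually freed again. If completing the I/O
that would free them itself needs the failed device, the reserve runs dry
and we are back at the same problem.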


Sincerely,
    Lars Marowsky-Brée <lmb suse de>

-- 
High Availability & Clustering	      \ ever tried. ever failed. no matter.
SUSE Labs			      | try again. fail again. fail better.
Research & Development, SUSE LINUX AG \ 	-- Samuel Beckett


