
Re: [dm-devel] 2.6.2-udm2

On Tue, Feb 17, 2004 at 05:33:22PM +0000, Joe Thornber wrote:
> http://people.sistina.com/~thornber/dm/patches/2.6-unstable/2.6.2/2.6.2-udm2.tar.bz2
> Big changes to the multipath target.
> - The test interval is no longer passed in with the target parameters
>   (tools writers take note).
> - The kernel now does no path testing at all, let userland do it (see thread on lkml).

I'm not convinced that we can recover from OOM situations with either
approach (userspace testing or dm testing of failed paths).

We can, and will need to, mlockall() the path test tool, which will make
sure that we can access all pages of the tool and of any shared library or
mmap'ed file it uses.
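
To make the mlockall() part concrete, here is a minimal sketch of what such
a userspace tester could do. The names lock_tester_memory(), test_path() and
TEST_IO_SIZE are made up for illustration and are not part of any existing
tool; the only real interfaces used are mlockall(2), open(2), read(2) and
close(2). Note that this only pins the tester's own pages. It does nothing
about allocations made inside the kernel on the tester's behalf.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define TEST_IO_SIZE 4096	/* hypothetical size of one test read */

/*
 * Pin all current and future pages of this process in RAM.
 * Returns 0 on success, -1 on failure (e.g. missing privilege or
 * RLIMIT_MEMLOCK too low).
 */
static int lock_tester_memory(void)
{
	if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
		perror("mlockall");
		return -1;
	}
	return 0;
}

/*
 * Read one block from the start of a path into a caller-supplied,
 * preallocated buffer, so the tester itself allocates nothing here.
 * Returns 0 if the path answered with a full block, -1 otherwise.
 * Assumption: a real tester would probably also want O_DIRECT (with a
 * suitably aligned buffer) so the read is not served from the page cache.
 */
static int test_path(const char *path, void *buf, size_t len)
{
	ssize_t n;
	int fd;

	assert(buf != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	n = read(fd, buf, len);
	close(fd);

	return n == (ssize_t)len ? 0 : -1;
}
```

The tester would call lock_tester_memory() once at startup, allocate its
buffers before that call returns to the main loop, and then call test_path()
for each failed path at whatever interval userland chooses.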

But if *all* paths of the multipath target to test have failed *and* the
system is OOM, the driver that is called to queue the test io can sleep
while allocating memory (either by calling [kv]malloc() directly or
indirectly through mempools).
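
To illustrate why a mempool only postpones the problem, here is a userspace
toy model of the idea, not the kernel's mempool code. All names (toy_pool,
toy_pool_alloc() and so on) and the sizes are invented for the example, and
the allocator_ok flag merely simulates whether the normal allocator can
still satisfy a request. The point is the last branch: once the reserve is
used up, the kernel version would sleep until somebody frees an element.

```c
#include <stddef.h>
#include <stdlib.h>

#define POOL_MIN  2	/* hypothetical number of reserved elements */
#define ELEM_SIZE 64	/* hypothetical element size in bytes */

struct toy_pool {
	void *reserve[POOL_MIN];	/* preallocated emergency elements */
	int nr_free;			/* how many of them are available */
};

/* Fill the reserve up front, while memory is still available. */
static int toy_pool_init(struct toy_pool *p)
{
	p->nr_free = 0;
	while (p->nr_free < POOL_MIN) {
		void *e = malloc(ELEM_SIZE);

		if (!e)
			return -1;
		p->reserve[p->nr_free++] = e;
	}
	return 0;
}

/*
 * Try the normal allocator first, then fall back to the reserve.
 * allocator_ok == 0 simulates an OOM system.
 */
static void *toy_pool_alloc(struct toy_pool *p, int allocator_ok)
{
	if (allocator_ok) {
		void *e = malloc(ELEM_SIZE);

		if (e)
			return e;
	}
	if (p->nr_free > 0)
		return p->reserve[--p->nr_free];

	/* Reserve exhausted: the kernel would sleep here, this toy gives up. */
	return NULL;
}

/* Returned elements refill the reserve before going back to the system. */
static void toy_pool_free(struct toy_pool *p, void *e)
{
	if (p->nr_free < POOL_MIN)
		p->reserve[p->nr_free++] = e;
	else
		free(e);
}
```

In this model an allocation with the reserve empty and allocator_ok == 0
can only succeed after a toy_pool_free(), i.e. after an earlier io has
completed.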

That memory allocation is in danger of deadlocking, because freeing memory
requires pageouts to the very multipathed target we want to unfail.

The 'workaround' for this, reloading the table in order to set all paths
to operational again, would involve memory allocation as well :(

Tell me what I am missing if this is not an issue?

Heinz    -- The LVM Guy --

> - Switch to work queues rather than dm-daemon (see other thread on lkml).
> Changes since 2.6.2-udm1
> ------------------------
> Revision 25:
>   Audit for list_for_each_*entry*
> Revision 26:
>   Zero m->current_path if it has failed, so that we choose a new one
>   more quickly.
> Revision 27:
>   The snapshot destructor shouldn't be triggering an event, this code
>   must date from before the reference counting of tables was put in.
> Revision 28:
>   Fix a dm-crypt compile warning on x86-64 with gcc 3.3.1.
>   [Andreas Steinmetz]
> Revision 29:
>   Fix IV generation type returned in dm-crypt status method.
>   [Christophe Saout]
> Revision 30:
>   Fill in missing queue limitations when table is complete instead of
>   enforcing the "default" limits on every dm device.
>   Problem noticed by Mike Christie.
>   [Christophe Saout]
> Revision 31:
>   Change _mpath_lock from a spinlock to a semaphore.  [Kevin Corry]
> Revision 32:
>   Can't call dm_table_event() while holding a spinlock with interrupts
>   disabled.  [Kevin Corry]
> Revision 33:
>   Strip out path testing, we'll do this from userland instead.
> Revision 34:
>   Use work_queues rather than dm-daemon.  [Mike Christie/Joe Thornber]
> Revision 35:
>   Remove obvious comment.
> Revision 36:
>   Comment MPATH_MIN_IO.
> Revision 37:
>   Work-queue names must be less than 10 characters.
> --
> dm-devel mailing list
> dm-devel redhat com
> https://www.redhat.com/mailman/listinfo/dm-devel

*** Software bugs are stupid.
    Nevertheless it needs not so stupid people to solve them ***


Heinz Mauelshagen                                 Red Hat, Inc.
Consulting Development Engineer                   Am Sonnenhang 11
                                                  56242 Marienrachdorf
Mauelshagen RedHat com                            +49 2626 141200
                                                       FAX 924446
