
Re: [Linux-cluster] dealing with oom-killer....



On 02/09/09 11:33, Corey Kovacs wrote:
A colleague has a 5-node cluster with 4GB of RAM in each node. It's not
enough for the cluster and more RAM is on the way. The problem, though,
is that until the RAM arrives there is a risk of the oom-killer firing
and leaving a node in a state where it is utterly useless but still
looks good to the cluster (as he found out the other day). We could
of course disable oom-killer but that's a workaround, not a fix.

I am wondering whether it is possible for the cluster to respond to
oom-killer firing by fencing the offending node and, if so, how others
might have done it. Seems like it should just be handled by the cluster
though. Maybe have cman put a message across the openais "bus" like,
"Hey, losing my brain here, someone whack me"...


I suppose you could give cman a large value in /proc/<pid>/oom_adj (oom_score itself is read-only; it only reports the result) so that it is the first thing to be killed if the system runs out of memory. That should guarantee that it will be fenced by the other nodes ... provided they have enough memory to remain quorate!
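
For illustration, here is a rough Python sketch of that idea. It is an assumption-laden example, not a tested cluster recipe: the helper name `raise_oom_priority` and its `proc_root` parameter are made up for this example, the caller has to look up the pid of the cluster daemon itself, and writing to these files normally requires root. It tries `oom_score_adj` (newer kernels, range -1000 to 1000) first and falls back to `oom_adj` (older kernels, range -17 to 15).

```python
import os


def raise_oom_priority(pid, proc_root="/proc"):
    """Mark `pid` as the preferred OOM-killer victim.

    Hypothetical helper: writes the maximum adjustment to whichever
    tunable the running kernel exposes and returns the path written.
    """
    # (file name, maximum value) pairs, newest interface first.
    candidates = (("oom_score_adj", "1000"), ("oom_adj", "15"))
    for name, value in candidates:
        path = os.path.join(proc_root, str(pid), name)
        if os.path.exists(path):
            with open(path, "w") as f:
                f.write(value)
            return path
    raise FileNotFoundError(f"no OOM adjustment file found for pid {pid}")
```

The setting is per-process and is lost when the daemon restarts, so it would presumably have to be reapplied from whatever starts the cluster services.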

Chrissie

