
Re: [Linux-cluster] Two-node cluster: Node attempts stateful merge after clean reboot



Consider that the hung/failed node was in the middle of a write to the SAN and froze. Now imagine that at some point in the future it recovers. Having no idea that time has passed, it has no reason to doubt that its locks are still valid, so it just finishes the writes. Congrats, you could have just corrupted your storage.

UMMMMMM

I use ext3 (LV) -> (VG, exclusive=true with clvmd) -> (PV) -> (multipath) -> (SAN). As you know, Red Hat cluster only supports failover resources, so your example is not very clear to me. How can I corrupt the storage with clean_start=1?


2013/9/12 Digimer <lists alteeve ca>
The problem that Pascal has is that the node sees the peer, joins and fences anyway. So in this case, clean_start won't help.

Even with a SAN/qdisk though, it's not needed to enable this. If the remaining node can't talk to qdisk, it won't have quorum and will not be offering services, so fencing it won't hurt. It's *always* better to put nodes into a known state, regardless of quorum.

Consider that the hung/failed node was in the middle of a write to the SAN and froze. Now imagine that at some point in the future it recovers. Having no idea that time has passed, it has no reason to doubt that its locks are still valid, so it just finishes the writes. Congrats, you could have just corrupted your storage.

_Never_ assume _anything_.

"The only thing you don't know is what you don't know."

digimer

On 11/09/13 18:24, emmanuel segura wrote:
Corrected version of my previous mail:

clean_start=1 disables startup fencing. If you use a quorum disk in
your cluster without expected_votes=1, then when a node starts after it
has been fenced, it doesn't try to fence the remaining node and doesn't
try to start services, because rgmanager needs a quorate cluster. Many
people say clean_start=1 is dangerous, but no one gives a clear reason.
In my production cluster I have clvm + VG in exclusive mode +
(clean_start=1) + (master_wins). So if you can explain to me where the
problem is :) I'd appreciate it.
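
For readers following along, the setup described above might look roughly like this in cluster.conf. This is only a sketch: the cluster name, node names, and quorum disk label are placeholders, the fence devices and services are left out, and expected_votes="3" assumes two one-vote nodes plus a one-vote quorum disk. None of it is taken from the poster's actual configuration.

```xml
<?xml version="1.0"?>
<!-- Hypothetical two-node cluster; all names and labels are placeholders. -->
<cluster name="example" config_version="1">
  <!-- clean_start="1": fenced skips startup fencing when the node joins -->
  <fence_daemon clean_start="1"/>
  <!-- Assumed vote count: 2 node votes + 1 quorum disk vote; two_node is not set -->
  <cman expected_votes="3"/>
  <!-- master_wins="1": only the qdiskd master advertises the quorum disk vote -->
  <quorumd label="example_qdisk" master_wins="1"/>
  <clusternodes>
    <clusternode name="node1" nodeid="1" votes="1"/>
    <clusternode name="node2" nodeid="2" votes="1"/>
  </clusternodes>
</cluster>
```

Note that a configuration like this still contains no per-node fence methods, so it should not be read as a working or recommended setup.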



2013/9/11 Digimer <lists alteeve ca>

    On 11/09/13 12:04, emmanuel segura wrote:

        Hello Pascal,

        To disable startup fencing you need clean_start=1 in the
        fence_daemon tag. I saw in your previous mail that you are
        using expected_votes="1". With this setting the cluster can be
        partitioned into two clusters, with each node operating
        independently. I recommend using a quorum disk with the
        master_wins parameter.


    This is a very bad idea and is asking for a split-brain, the main
    reason fencing exists at all.


    --
    Digimer
    Papers and Projects: https://alteeve.ca/w/
    What if the cure for cancer is trapped in the mind of a person
    without access to education?




--
this is my life and I will live it for as long as God wills


