
Re: [Linux-cluster] qdisk WITHOUT fencing



On 06/18/2010 07:57 AM, Jankowski, Chris wrote:

Using the analogy you gave, the problem with a mafioso is that he cannot kill
all other mafiosos in the gang when they are all sitting in solitary confinement
cells (:-)).

Do you have a better idea? How do you propose to ensure that there is no resource clash when a node becomes intermittent or half-dead? How do you prevent its interference from bringing down the service? More importantly, how would you handle this when ensuring consistency is of paramount importance, e.g. when using a cluster file system?

I would like to remark that this STONITH business causes endless
problems in clusters within a single data centre too. For example a
temporary hiccup on the network that causes short heartbeat failure
triggers all nodes of the cluster to kill the other nodes. And boy,
do they succeed with a typical HP iLO fencing. You can see all your
nodes going down. Then they come back and the shootout continues
essentially indefinitely if fencing works. If not, then they all
block.

If your network is that intermittent, you have bigger problems.
But you can adjust your cman timeout values (<totem token = "[timeout in milliseconds]"/>) to something more appropriate to the quality of your network.
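
As a rough sketch of that change, the snippet below sets the token timeout in a cluster.conf-style document and bumps config_version so the new version can be told apart from the old one. The sample configuration, the cluster name "example" and the 30000 ms value are all made up for illustration; they are not recommendations, and a real cluster.conf contains many more elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed cluster.conf; names and values are illustrative only.
SAMPLE_CONF = """<cluster name="example" config_version="7">
  <clusternodes/>
</cluster>"""


def set_token_timeout(conf_xml: str, timeout_ms: int) -> str:
    """Return conf_xml with <totem token="..."/> set and config_version incremented."""
    root = ET.fromstring(conf_xml)
    totem = root.find("totem")
    if totem is None:
        # No <totem> element yet: add one directly under <cluster>.
        totem = ET.SubElement(root, "totem")
    totem.set("token", str(timeout_ms))
    # Increment the version attribute so the edited file is recognisably newer.
    root.set("config_version", str(int(root.get("config_version", "0")) + 1))
    return ET.tostring(root, encoding="unicode")


new_conf = set_token_timeout(SAMPLE_CONF, 30000)
print(new_conf)
```

Pick the actual value by looking at how long your network hiccups really last; a longer timeout also means a genuinely dead node takes longer to be noticed.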

And all of that is so unnecessary, as a combination of a properly
implemented quorum  disk and SCSI reservations with local boot disks
and data disks on shared storage  could provide quorum maintenance,
split-brain avoidance and protection of the integrity  of the
filesystem.

I disagree. If a node starts to go wrong, it cannot be trusted not to trash the file system, ignoring quorum and suchlike. Data integrity is too important to take that risk.

DEC ASE cluster on Ultrix and MIPS hardware had that in 1991. You do
not  even need GFS2, although it is very nice to have a real cluster
filesystem.

If you want something that's looser than a proper cluster FS without the need for fencing (and are happy to live with the fact that when split-brain occurs, one of the files will win and the other copies _will_ get trashed), you may want to look into GlusterFS if you haven't already.

By the way, I believe that commercial stretched cluster on Linux is
not possible if you rely on LVM for distributed storage. Linux LVM
is architecturally incapable of providing any resilience over
distance, IMHO. It is missing the plex and subdisk layers as in
Veritas LVM and has no notion of location, so it cannot tell
which piece of storage is in which data centre. The only volume
manager that I know that has this feature is in OpenVMS.  Perhaps
the latest Veritas has it too.

I never actually found a purpose for LVM that cannot be done away with if you apply a modicum of forward planning (something that seems to be becoming quite rare in most industries these days). There are generally better ways than LVM to achieve the things that LVM is supposed to do.

One could use distributed storage arrays of the type of HP P4000
(bought with LeftHand Networks). This shifts the problem from the
OS to the storage vendor.

What distributed storage would you use in a hypothetical stretched
cluster?

Depends on what exactly your use case is. In most use cases, properly distributed storage (à la Cleversafe) comes with too much of a performance penalty to be useful when geographically dispersed. The single most defining measure of a system's performance is access latency. When caching gets difficult and your ping times move from LAN (slow) to WAN (ridiculous), performance generally becomes completely unworkable.
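
To put rough numbers on that: if every operation has to wait for one synchronous round trip before the next one can start, throughput is capped at 1/RTT no matter how much bandwidth you have. The round-trip times below are assumptions picked for illustration, not measurements, and the helper name is hypothetical.

```python
def max_sync_ops_per_second(rtt_ms: float) -> float:
    """Upper bound on strictly serialised operations that each wait one round trip."""
    return 1000.0 / rtt_ms


# Assumed round-trip times, for illustration only.
for label, rtt_ms in [("LAN", 0.5), ("WAN", 40.0)]:
    ceiling = max_sync_ops_per_second(rtt_ms)
    print(f"{label}: {rtt_ms} ms RTT -> at most {ceiling:.0f} synchronous ops/s")
```

Under those assumed figures the ceiling drops by a factor of 80 going from LAN to WAN, and that is before any locking or cache-coherency traffic is counted.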

Gordan


