
[Linux-cluster] Bad day in writesville

I have a test GFS cluster with two nodes using the CVS cluster code
from yesterday (Dec. 20th).  The nodes are using CMAN/DLM/CLVM.

Experimentation has found that heavy write activity on one node causes
that node to come to a screeching halt.  It stops sending heartbeats, so
the other node fences it.  I understand that and it's the way I'd expect
the _cluster_ to behave, but I'm sure unhappy that the first node
crashes in the first place.

The first crash occurred when we decided to replicate content from a NAS
to the GFS-mounted filesystem via an rsync job.  After rebooting the
failed node and such, I decided to wipe the data off the GFS volume via
an "rm -rf *".  It crashed in the same manner.

An IRC conversation with one of the developers suggested that I try GULM
and a lock server rather than CMAN/DLM/CLVM.  The problem is that the
GFS volume is an LVM volume on a SCSI-based SAN and, of course, I can't
"vgchange" it because clvmd isn't available in GULM.  I could pass
"--ignorelockingfailure" to vgchange, but that doesn't sound safe to
me, and I'm concerned that if both nodes have to write to the
filesystem, bad, evil things will happen.

A second issue is that I performed the "gfs_mkfs" with the "-p lock_dlm"
option.

So, my main questions are:

1. Can I continue to use an LVM device under GULM safely, and if so,
how?  I'd like to continue with LVM as there are times where this
filesystem will have to be "grown" as more content appears (in our
business, it's impossible to predict how much content there will be).

2. Do I have to destroy the filesystem and reformat it using the
"-p lock_gulm" option?

As I said, this is a lab rat setup right now and wiping things out
to start over isn't a problem.  I must have reliable write activity
going on here, preferably from all cluster nodes, or GFS isn't going to
work in our environment, and I really need it or something like it.

Help me, Obi Wan Kenobis!  You're my only hope!
- Rick Stevens, Senior Systems Engineer     rstevens vitalstream com -
- VitalStream, Inc.                       http://www.vitalstream.com -
-                                                                    -
-        Change is inevitable, except from a vending machine.        -
