
Re: [Linux-cluster] Bad day in writesville - Followup #2



Rick Stevens wrote:
Rick Stevens wrote:

Ken Preslan wrote:

On Tue, Dec 21, 2004 at 05:19:30PM -0800, Rick Stevens wrote:

2. Do I have to destroy the filesystem and reformat it using the "-p lock_gulm" option?




To change the module permanently, you can unmount the filesystem on all
nodes and run the commands:

gfs_tool sb <device> proto <module>
gfs_tool sb <device> table <table>

If it's just a temporary thing, you can unmount the filesystem on all
nodes and mount with options to override the defaults in the superblock:

mount -o lockproto=<module>,locktable=<table>


<module> is whatever you'd pass to gfs_mkfs's -p option, and <table> is whatever you'd pass to its -t option.


Note that you need to be careful when you do either of these things. Running a mixed cluster where some machines are locking a FS with one protocol/table and other machines are locking the same FS with a different protocol/table is bad. It is bound to end in great sadness.
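
For anyone following along, here is a rough sketch of the permanent change as a helper that only prints the two gfs_tool commands, so they can be reviewed before being run by hand. The function name, device path and table name are made up for illustration; the filesystem still has to be unmounted on all nodes first, as described above.

```shell
#!/bin/sh
# Sketch only: print (do not run) the commands for a permanent change of
# lock module and lock table. All values come from the caller.
print_lock_change() {
    device=$1   # block device holding the GFS filesystem
    module=$2   # lock module, as given to gfs_mkfs -p
    table=$3    # lock table, as given to gfs_mkfs -t
    echo "gfs_tool sb $device proto $module"
    echo "gfs_tool sb $device table $table"
}

# Hypothetical device and table names, for illustration only.
print_lock_change /dev/vg0/gfs0 lock_gulm mycluster:gfs0
```

Printing rather than executing is deliberate here: the superblock is shared by every node, so it seems safer to look at the exact commands once before touching it.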



Gotcha. After looking at the bugzilla entries referred to in other replies to my question, it appears that LVM works fairly well with DLM but has major issues with GULM. However, someone mentioned that one can use DLM to lock LVM and use GULM to lock GFS.

Since the LVM stuff is pretty important, I intend to try that method.
I rebuilt LVM to use DLM/CMAN.  I've modprobed both lock_dlm and
lock_gulm and started clvmd.  I've used "cman_tool join" to set up DLM
and "vgchange -aly"d the LVM volume and it appeared.  I've used gfs_tool
to change the locking protocol on the filesystem to use GULM (the table
remains the same) and it mounted fine.  I'm about to start the same
stuff on the second node.  I'll keep you informed.
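
The sequence above could be scripted roughly as follows. This is a sketch under assumptions, not a record of the exact commands used: the device and mount point are hypothetical, the ordering (joining the cluster before starting clvmd) is a guess at a sensible order, and whatever GULM server setup the cluster needs is left out because it isn't described here. By default the script only prints what it would do.

```shell
#!/bin/sh
# Sketch of the steps described above. DRY_RUN=1 (the default) only
# prints each command; set DRY_RUN=0 to actually execute them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# One-time step: the superblock is shared, so this is done once, with the
# filesystem unmounted on every node.
switch_to_gulm() {
    run gfs_tool sb "$1" proto lock_gulm
}

# Per-node step: DLM/CMAN for clvmd, then mount the GFS filesystem.
bring_up_node() {
    device=$1       # hypothetical, e.g. /dev/vg0/gfs0
    mountpoint=$2   # hypothetical, e.g. /mnt/gfs
    run modprobe lock_dlm
    run modprobe lock_gulm
    run cman_tool join
    run clvmd
    run vgchange -aly
    run mount -t gfs "$device" "$mountpoint"
}

bring_up_node /dev/vg0/gfs0 /mnt/gfs
```

Splitting the superblock change from the per-node steps keeps the second node from repeating a change that only needs to happen once.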


Followup:

Well, that seems to be the fix.  I used CMAN/DLM to manage the LVM
volumes via clvmd and I changed the GFS filesystem to use GULM as its
locking mechanism.  No hiccups so far, so it appears that heavy write
activity freaks out DLM when it's also managing GFS.

I wish I had traces and such to show you what was going on with CMAN/DLM
only, but the machine locked up so hard only a reboot could bring it
back.

I'm going to keep testing this, so there will be an additional follow-up.
I'd also like to thank Ken Preslan, Patrick Caulfield and Derek Anderson
for their help on this.  I may be much closer to a solution and you
chaps helped a lot.  Thanks!

Followup #2:


This is cool!  Based on the 20 December 2004 CVS code, I'm using
CMAN/DLM to manage the VG and LVM locking and GULM to manage the GFS
locking.  It works like a charm--even having both nodes write to the
same file.

For the final test, I added a new LU off the SAN to the VG, extended
the LV by the size of the new LU and was able to grow the GFS filesystem
on top of that...all without having to shut anything down.
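
Roughly, the online grow looks like the sketch below. The helper only prints the commands; every argument (device, VG, LV, size, mount point) is a hypothetical placeholder, and the size has to match whatever the new LU actually provides.

```shell
#!/bin/sh
# Sketch only: print the commands for growing a GFS filesystem online
# after a new LU has been presented to the nodes.
print_grow_steps() {
    new_pv=$1      # new SAN LU as seen by the node, e.g. /dev/sdc
    vg=$2          # volume group name, e.g. vg0
    lv=$3          # logical volume path, e.g. /dev/vg0/gfs0
    size=$4        # amount to add to the LV, e.g. 100G
    mountpoint=$5  # where the GFS filesystem is mounted, e.g. /mnt/gfs
    echo "pvcreate $new_pv"
    echo "vgextend $vg $new_pv"
    echo "lvextend -L +$size $lv $new_pv"
    echo "gfs_grow $mountpoint"
}

# Hypothetical names and size, for illustration only.
print_grow_steps /dev/sdc vg0 /dev/vg0/gfs0 100G /mnt/gfs
```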

So, success is near!  Next, we plan to deploy this cluster into our real
environment and let our clients flog it.  If it crashes, our load
balancer will block access to it and I'll do a post-mortem at that
point.

Well, that's enough for now.  It's 14:15 on 23 December here in
California, and I'm skedaddling until Monday (well, the laptop
is full of stuff to do on Sunday).

Thanks again for the help, gang!  Have a Happy Holiday.  Don't eat or
drink too much!
----------------------------------------------------------------------
- Rick Stevens, Senior Systems Engineer     rstevens vitalstream com -
- VitalStream, Inc.                       http://www.vitalstream.com -
-                                                                    -
-    First Law of Work:                                              -
-    If you can't get it done in the first 24 hours, work nights.    -
----------------------------------------------------------------------

