[Linux-cluster] Bad day in writesville - Followup

Rick Stevens rstevens at vitalstream.com
Wed Dec 22 23:27:30 UTC 2004


Rick Stevens wrote:
> Ken Preslan wrote:
> 
>> On Tue, Dec 21, 2004 at 05:19:30PM -0800, Rick Stevens wrote:
>>
>>> 2. Do I have to destroy the filesystem and reformat it using the "-p 
>>> lock_gulm" option?
>>
>>
>>
>> To change the module permanently, you can unmount the filesystem on all
>> nodes and run the commands:
>>
>> gfs_tool sb <device> proto <module>
>> gfs_tool sb <device> table <table>
>>
>> If it's just a temporary thing, you can unmount the filesystem on all
>> nodes and mount with options to override the defaults in the superblock:
>>
>> mount -o lockproto=<module>,locktable=<table>
>>
>>
>> <module> is whatever you'd pass to gfs_mkfs -p, and <table> is
>> whatever you'd pass to gfs_mkfs -t.
>>
>>
>> Note that you need to be careful when you do either of these things.
>> Running a mixed cluster where some machines are locking a FS with one
>> protocol/table and other machines are locking the same FS with a
>> different protocol/table is bad.  It is bound to end in great sadness.
> 
> 
> Gotcha.  After looking at the bugzilla entries referred to in other
> replies to my question, it appears that LVM works fairly well with DLM
> but has major issues with GULM.  However, someone mentioned that one can
> use DLM to lock LVM and use GULM to lock GFS.
> 
> Since the LVM stuff is pretty important, I intend to try that method.
> I rebuilt LVM to use DLM/CMAN.  I've modprobed both lock_dlm and
> lock_gulm and started clvmd.  I've used "cman_tool join" to set up DLM
> and "vgchange -aly"d the LVM volume and it appeared.  I've used gfs_tool
> to change the locking protocol on the filesystem to use GULM (the table
> remains the same) and it mounted fine.  I'm about to start the same
> stuff on the second node.  I'll keep you informed.

Followup:

Well, that seems to be the fix.  I used CMAN/DLM to manage the LVM
volumes via clvmd and I changed the GFS filesystem to use GULM as its
locking mechanism.  No hiccups so far, so it appears that heavy write
activity freaks out DLM when it's also managing GFS.
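
For the archives, here is a rough sketch of the per-node sequence described
above. It is not a tested procedure. The device and mount point are
placeholders (the real paths are not given in this thread), the function and
variable names are made up for illustration, and the ordering of
"cman_tool join" before clvmd is an assumption. It also assumes the GULM lock
servers are already running. The script only prints the commands unless
DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: DLM/CMAN locks LVM (via clvmd), GULM locks the GFS filesystem.
# Dry run by default: commands are printed, not executed. Set DRY_RUN=0 to run.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# bring_up_node DEVICE MOUNTPOINT [yes|no]
# The third argument says whether to rewrite the superblock lock protocol.
# That is a one-time change, made with the filesystem unmounted on ALL nodes.
bring_up_node() {
    device="$1"
    mountpoint="$2"
    change_proto="${3:-no}"

    run modprobe lock_dlm      # DLM, used here only for LVM locking
    run modprobe lock_gulm     # GULM, used for GFS locking
    run cman_tool join         # join the CMAN cluster so DLM is available
    run clvmd                  # clustered LVM daemon
    run vgchange -aly          # activate the volume group(s) on this node

    if [ "$change_proto" = "yes" ]; then
        # Switch the filesystem's lock protocol; the table is left unchanged.
        run gfs_tool sb "$device" proto lock_gulm
    fi

    run mount -t gfs "$device" "$mountpoint"
}

# Placeholder paths; substitute the real logical volume and mount point.
bring_up_node "${1:-/dev/example_vg/example_lv}" "${2:-/mnt/gfs}" "${3:-no}"
```

Run with no arguments, this just lists the commands. Pass "yes" as the third
argument only on the node doing the one-time protocol change, and only while
the filesystem is unmounted everywhere, per Ken's warning about mixed locking.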

I wish I had traces and such to show you what was going on with CMAN/DLM
alone, but the machine locked up so hard that only a reboot could bring
it back.

I'm going to keep testing this, so there will be an additional follow-up.
I'd also like to thank Ken Preslan, Patrick Caulfield and Derek Anderson
for their help on this.  I may be much closer to a solution and you
chaps helped a lot.  Thanks!
----------------------------------------------------------------------
- Rick Stevens, Senior Systems Engineer     rstevens at vitalstream.com -
- VitalStream, Inc.                       http://www.vitalstream.com -
-                                                                    -
-    Reality: A crutch for those who can't handle science fiction    -
----------------------------------------------------------------------



