[Linux-cluster] Probably some silly mistake setting up a cluster ?
Petr Tuma
petr.tuma at nenya.ms.mff.cuni.cz
Wed Feb 27 15:12:43 UTC 2008
Greetings,
I am trying to set up a cluster with (for now) two nodes. The reason is
the semantic guarantees GFS provides when accessing shared files (that is,
I am not interested in fault tolerance, performance or anything else).
Unfortunately, I keep running into all sorts of problems, for
example:
- After a few hours of intensive workload, the cluster sometimes
simply stops. All file system calls block, but tools like cman_tool
status or group_tool status insist everything is all right. A soft reboot
is not possible because various services wait indefinitely, and after
power cycling, fsck finds inconsistencies on the file system.
- Sometimes, when trying to execute a binary on the file system, I get
execvp returning permission denied when it should not, but when I try
again, everything is all right. I sometimes even observe this when
trying to start a script on the file system, as if the interpreter of
the script (which is on a different file system altogether) had wrong
permissions. Again, simply trying one more time makes everything work.
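The "try one more time" behavior in the second item can be sketched as follows. This is only an illustration of the workaround, not a fix for whatever causes the spurious error; the helper name run_with_retry, the attempt count and the delay are assumptions and are not taken from the actual setup.

```python
import subprocess
import time


def run_with_retry(argv, attempts=3, delay=0.5):
    """Run argv, retrying only when the exec itself fails with permission denied."""
    for attempt in range(1, attempts + 1):
        try:
            # check=False: a non-zero exit status of the program is not retried.
            return subprocess.run(argv, check=False)
        except PermissionError:
            # exec reported EACCES; re-raise once the last attempt has failed.
            if attempt == attempts:
                raise
            time.sleep(delay)
```

It would be called as run_with_retry(["/mnt/gfs2/bin/sometool"]), where the path is hypothetical. A genuine permission problem still surfaces as PermissionError after the last attempt.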
The config of the cluster seems relatively simple:
- i686 single CPU node
  - file system device accessible over iSCSI
  - cluster subnet (unfortunately) connected over OpenVPN
- x86_64 eight CPU virtual node
  - file system device provided by host which uses iSCSI
- both node names resolve to addresses in the same subnet using /etc/hosts
- nothing except a single GFS2 file system is mounted
- fencing uses fence_manual
- both nodes run Fedora 8
The config is attached, though there is nothing unusual in it.
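The claim that both node names resolve into the same subnet can be checked with a short sketch like the one below. The function names and the example addresses and network are hypothetical; the real node names and subnet are not given here.

```python
import ipaddress
import socket


def in_same_subnet(addr_a, addr_b, network):
    """Return True if both IP address strings fall inside network (CIDR notation)."""
    net = ipaddress.ip_network(network)
    return ipaddress.ip_address(addr_a) in net and ipaddress.ip_address(addr_b) in net


def nodes_share_subnet(host_a, host_b, network):
    """Resolve both node names (via /etc/hosts or DNS) and test them against network."""
    return in_same_subnet(
        socket.gethostbyname(host_a),
        socket.gethostbyname(host_b),
        network,
    )
```

For example, in_same_subnet("10.8.0.1", "10.8.0.2", "10.8.0.0/24") holds, while an address outside 10.8.0.0/24 would fail the check. Running nodes_share_subnet on each node would confirm that both /etc/hosts files agree.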
As an absolute novice, I am probably making some glaringly obvious silly
mistake which results in the very weird behavior described above, but
try as I might, I do not see anything that could cause this.
Thanks for any advice, Petr
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cluster.conf
Type: text/xml
Size: 711 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080227/9ccfa91e/attachment.xml>