[Linux-cluster] GFS filesystem "hang" with cluster-1.03.00

Hi List,

I'm hoping someone can provide me with pointers to solve the following problem:

I've set up a cluster of 5 nodes with cluster-1.03.00 compiled from source. The cluster works fine.
I can fence the nodes and all nodes see each other.

I've created a GFS filesystem on a Coraid shared device using CLVM.

All nodes see the filesystem and see small changes to the filesystem.

Last night I started a stress-test with iozone on all nodes:

mkdir /mnt/$(hostname)
iozone -Rab /home/iozone-$(hostname)-test-${DATE}.xls -i0 -g16G -f /mnt/$(hostname)/iozone-test${DATE}
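
For reference, a minimal sketch of how the two commands above could be wrapped in a script. DATE is not defined in the commands as shown, so the timestamp format used here is an assumption, as are the iozone_cmd helper and the RUN_IOZONE switch (both hypothetical names). By default the script only prints the command line.

```shell
#!/bin/sh
# Sketch only: wraps the iozone invocation shown above.
# Assumption: DATE is a timestamp; the format below is a guess.
DATE=${DATE:-$(date +%Y%m%d-%H%M)}
HOST=$(hostname)

# Hypothetical helper: build the iozone command line for one node.
# $1 = hostname, $2 = date stamp
iozone_cmd() {
    echo "iozone -Rab /home/iozone-$1-test-$2.xls -i0 -g16G -f /mnt/$1/iozone-test$2"
}

cmd=$(iozone_cmd "$HOST" "$DATE")
echo "$cmd"

# Only run the test when explicitly requested (RUN_IOZONE=1).
if [ "${RUN_IOZONE:-0}" = "1" ]; then
    # -p (not in the original command) avoids an error if the directory exists.
    mkdir -p "/mnt/$HOST"
    sh -c "$cmd"
fi
```

Printing the command first makes it easy to check that every node writes to its own directory and its own report file before a long run is started.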

This test started at 4 AM and is still running on all nodes.

When I run the same test on a single node, it produces a nice test report showing an average write performance of 35MB/s.
This is in line with what I expect from the hardware in its current setup.

Most operations on the GFS filesystem take a long time the first time they are run: gfs_tool counters /mnt takes roughly a minute on the first call, and afterwards it responds normally, within a second. The same is true for operations like ls, df, etc.
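
The slow-first-call behaviour described above can be measured with a small script along these lines. This is a sketch, not something I ran as part of the test: the elapsed and cold_warm helpers are hypothetical names, MNT defaults to /mnt only because that is the mount point used in the gfs_tool example, and timings have whole-second resolution.

```shell
#!/bin/sh
# Sketch only: time a command twice to compare the first call with a repeat call.
# Assumption: the filesystem is mounted on /mnt, as in the gfs_tool example.
MNT=${MNT:-/mnt}

# Print the wall-clock seconds a command takes (whole seconds only).
elapsed() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# Hypothetical helper: run the same command twice and report both timings.
cold_warm() {
    first=$(elapsed "$@")
    second=$(elapsed "$@")
    echo "$*: first=${first}s second=${second}s"
}

# Example calls, commented out so the sketch has no side effects:
# cold_warm gfs_tool counters "$MNT"
# cold_warm ls "$MNT"
# cold_warm df "$MNT"
```

Running the example calls on each node, before and during the iozone run, would show whether the delay occurs only on the first access or comes back under load.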

I have no clue why concurrent writes hang and would appreciate any pointers on where to start looking.

Hardware: x86_64, Intel(R) Xeon(R) CPU 5140 @ 2.33GHz

Thank you,

Ramon van Alteren

