bigendian+gfs gmail com wrote:
I've just set up a new two-node GFS cluster on a CORAID sr1520
ATA-over-Ethernet unit. Each node is a quad dual-core Opteron
system with 32GB RAM. The CORAID unit exports a 1.6TB block
device that I have a GFS file system on.
I seem to be having performance issues where certain read system
calls take up to three seconds to complete. My test app is
bonnie++, and the slowdowns appear to happen in the "Rewriting"
portion of the test, though I'm not sure they are limited to it. If I
watch top and iostat for the device in question, I see activity on
the device, then long (up to three-second) periods of no apparent
I/O. During the periods of no I/O the bonnie++ process is blocked
on disk I/O, so it seems that the system is trying to do something.
Network traces seem to show that the host machine is not waiting on
the RAID array, and the packet following the dead period seems to
always be sent from the host to the coraid device. Unfortunately, I
don't know how to dig in any deeper to figure out what the problem is.
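
One way to narrow this down outside of bonnie++ might be to time each
read call directly and record which ones stall. The following is only a
minimal sketch, assuming Python is available on the nodes; the function
name, block size, and threshold are illustrative choices, not part of
the setup described above.

```python
import time

def find_slow_reads(path, block_size=8192, threshold_s=1.0):
    """Read `path` sequentially; return (offset, seconds) for each slow read.

    block_size and threshold_s are illustrative defaults (assumptions).
    """
    slow = []
    # buffering=0 opens the file unbuffered, so each f.read() below
    # corresponds to one read system call on the underlying descriptor.
    with open(path, "rb", buffering=0) as f:
        offset = 0
        while True:
            start = time.monotonic()
            chunk = f.read(block_size)
            elapsed = time.monotonic() - start
            if not chunk:
                break  # end of file
            if elapsed >= threshold_s:
                slow.append((offset, elapsed))
            offset += len(chunk)
    return slow
```

Calling it on a large file on the GFS mount (for example
find_slow_reads("/path/to/testfile"), where the path is a placeholder)
returns the file offsets at which a read took at least the threshold.
Reads served from the page cache will not stall, so the file should be
larger than RAM or not recently read. Attaching strace -T to the
bonnie++ process gives similar per-call timings without a separate
script.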