
[Linux-cluster] Re: Re: GFS/GFS2 problems with iozone



Date: Mon, 4 May 2009 11:05:20 -0400 (EDT)
From: Bob Peterson <rpeterso redhat com>
Subject: Re: [Linux-cluster] GFS/GFS2 problems with iozone
To: linux clustering <linux-cluster redhat com>

----- "Michael O'Sullivan" <michael osullivan auckland ac nz> wrote:
| Hi everyone,
|
| I am having some problems testing a GFS system using iozone. I am
| running CentOS 2.6.18-128.1.6.el5 and have a two node cluster with a GFS
| installed on a shared iSCSI target. The GFS sits on top of a 1.79TB
| clustered logical volume and can be mounted successfully on both cluster
| nodes.
|
| When using iozone to test performance everything goes smoothly until I
| get to a file size of 2GB and a record length of 2048. Then iozone exits
| with the error
|
| Error fwriting block 250, fd= 7
|
| and (as far as I can tell) the GFS becomes corrupted
|
| fatal: invalid metadata block
| bh = 12912396 (magic)
| function = gfs_get_meta_buffer
| file =
| /builddir/build/BUILD/gfs-kmod-0.1.31/_kmod_build_/src/gfs/dio.c,
| line = 1225
|
| Can anyone shed some light on what is happening?
|
| Kind regards, Mike O'S

Hi Mike,

Are you running iozone on a single node or both simultaneously?
If it's running on two nodes, please make sure that both nodes have
the iSCSI target mounted with lock_dlm protocol (not lock_nolock).
Also, we need to make sure that they're not trying to use the same
files in the file system because I think iozone is not cluster-aware.
But even so, the file system should not be corrupted unless one of
the nodes is using lock_nolock protocol, or if other boxes are
using the iSCSI target without the knowledge of GFS.
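
One way to check the protocol on each node is to look at the name recorded in the superblock. The sketch below is only illustrative: `lock_proto_of` is a hypothetical helper, and the sample string in the comments assumes the output format of `gfs_tool sb <device> proto`, which may differ between versions. A `lockproto=` mount option can override the superblock value, so the mount options are worth checking as well.

```shell
#!/bin/sh
# Hypothetical helper: report which lock protocol is named in a line of text.
# The text is expected to come from something like
#   gfs_tool sb <device> proto
# or from the mount options of the file system (both are assumptions
# about where the protocol name shows up on a given system).
lock_proto_of() {
    case "$1" in
        *lock_nolock*) echo "lock_nolock" ;;  # unsafe when more than one node mounts
        *lock_dlm*)    echo "lock_dlm" ;;     # expected for a clustered mount
        *)             echo "unknown" ;;
    esac
}

# Example use on each node (replace <device> with the GFS volume):
#   lock_proto_of "$(gfs_tool sb <device> proto)"
```

If any node reports lock_nolock while another node has the file system mounted, that alone could explain the corruption.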

We regularly run iozone here, in single-node performance trials, and
we have never seen this kind of problem.

Also, you didn't specify what version of the kmod-gfs package you have
installed.  I've fixed at least one bug that might account for it,
depending on what version of kmod-gfs you're running.

I'm not aware of any other problems in the GFS kernel code that can
account for this kind of corruption, except for possibly this one:

https://bugzilla.redhat.com/show_bug.cgi?id=491369

(This is a GFS bug that goes well beyond the NFS usage described in the bug report.)
You can find the patch in the attachments, although I won't guarantee
it'll solve your problem.  There's a slight chance though.
My apologies if you don't have permission to see the bug; that sometimes
happens and it's out of my control.  I can, however, post the patch
if needed.

If iozone is being run on a single node, this might be a new bug.  If you can
still recreate the problem with that patch in place, or if you don't want
to try the patch for some reason, perhaps you should open up a bugzilla
record and we'll investigate the problem.  If we can reproduce it, we'll
figure it out and fix it.

Regards,

Bob Peterson
Red Hat GFS

Hi Bob,

I have changed back to GFS2 (I realised it is now production ready; is that correct?), but I am still having similar problems. I am running iozone on a single node against a GFS2 mount point that uses lock_dlm. Note that the GFS2 file system sits on a multipathed device provided via iSCSI/DRBD. I run the following commands:

gfs2_fsck # which shows no errors on either node

mount -t gfs2 /dev/iscsi_mirror/lvol0 /mnt/iscsi_mirror/ # mounts the file system (on top of iSCSI/DRBD) on both nodes

/usr/src/ioszone3_321/src/current/iozone -Ra -g 4G -f /mnt/iscsi_mirror/test # Only on node 1

This gets to a file size of 1048576 KB and a record length of 256 before giving

Error reading block 1018 b6e00000
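
As a rough sanity check on where in the file the failure occurs: assuming iozone's block number is a zero-based index of records of the given record length (an assumption about iozone's output, not something confirmed in this thread), the byte offset can be estimated as below. `record_offset_bytes` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: estimate the byte offset of a failing record.
# $1 = block (record) index reported by iozone, assumed zero-based
# $2 = record length in KB, as iozone reports it
record_offset_bytes() {
    echo $(( $1 * $2 * 1024 ))
}

# Values from the error above: block 1018, record length 256 KB
record_offset_bytes 1018 256
```

Under that assumption the failing read is about 254.5 MB into the 1 GB file, so it is not at the end of the file.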

I can fix the GFS2 using gfs2_fsck (it fixes some dirty journals, but makes no other changes). I don't have the error messages from this latest test, as I ran it over the weekend and /var/log/messages no longer contains them. I can recreate this test and record the error messages if necessary, but I wonder if the patch you talked about also exists for GFS2?

Thanks very much for your help, Mike

