
Re: [Linux-cluster] GFS2 fatal: invalid metadata block

On Tue, 2009-10-20 at 10:07 +0100, Steven Whitehouse wrote:
> Hi,
> On Mon, 2009-10-19 at 16:30 -0600, Kai Meyer wrote:
> > Ok, so our lab test results have turned up some fun events.
> > 
> > Firstly, we were able to duplicate the invalid metadata block exactly 
> > under the following circumstances:
> > 
> > We wanted to monkey with the VLAN that fenced/openais ran on. We failed 
> > miserably, causing all three of my test nodes to believe they had become 
> > lone islands in the cluster, each unable to get enough votes to 
> > fence anybody. So we chose to simply power cycle the nodes without 
> > trying to gracefully leave the cluster or reboot (they are diskless 
> > servers with NFS root filesystems so the GFS2 filesystem is the only 
> > thing we were risking corruption with.) After the nodes came back 
> > online, we began to see the same random reboots and filesystem withdraws 
> > within 24 hours. The filesystem that went into production that 
> > eventually hit these errors was likely not reformatted just before 
> > putting into production, and I believe it is highly likely that the last 
> > format done on that production filesystem was done while we were still 
> > doing testing. I hope that as we continue in our lab, we can reproduce 
> > the same circumstances, and give you a step-by-step that will cause this 
> > issue. It'll make me feel much better about our current GFS2 filesystem 
> > that was created and unmounted cleanly by a single node, and then put 
> > straight into production, and has been only mounted once by our current 
> > production servers since it was formatted.
> > 
> That is very interesting information. We are not there yet, but there are a
> number of useful hints in that. Any further information you are able to
> gather would be very interesting.
> > Secondly, the way our VMs are doing I/O, we have found the cluster.conf 
> > configuration settings:
> > <dlm plock_ownership="1" plock_rate_limit="0"/>
> > <gfs_controld plock_rate_limit="0"/>
> > have lowered our %wa times from ~60% to ~30% utilization. I am curious 
> > why the locking daemon's rate limit defaults to such a low number 
> > (100). Adding these two parameters to cluster.conf raised our locks 
> > per second with the ping_pong binary from 93 to 3000+ in our 5 node 
> > cluster. Our throughput doesn't seem to improve by either upping the 
> > locking limit or setting up jumbo frames, but processes spend much less 
> > time in I/O wait state than before (if my munin graphs are believable). 
> > How likely is it that the low locking rate had a hand in causing the 
> > filesystem withdraws and 'invalid metadata block' errors?
> > 
> I think there would be an argument for setting the default rate limit to
> 0 (i.e. off) since we seem to spend so much time telling people to turn
> off this particular feature. The reason that it was added is that under
> certain circumstances it is possible to flood the network with plock
> requests resulting in the blocking of openais traffic (so the cluster
> thinks it's been partitioned).
> I've not seen or heard of any recent reports of this, but that was the
> original reason the feature was added. Most applications tend to be
> I/O bound rather than (fcntl) lock bound anyway, so the chances of it
> being a problem are fairly slim.
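For anyone wanting to reproduce the measurement above, the tuning being
discussed looks like this in /etc/cluster/cluster.conf (a sketch only;
the cluster name, nodes and fencing sections are elided):

```xml
<!-- Illustrative fragment; only the plock tuning is shown. -->
<cluster name="example" config_version="2">
  <!-- plock_rate_limit="0" disables the default cap of 100 plock
       operations/sec; plock_ownership="1" lets a node cache plocks
       it owns so repeat locks need no network round trip. -->
  <dlm plock_ownership="1" plock_rate_limit="0"/>
  <gfs_controld plock_rate_limit="0"/>
</cluster>
```

The lock rate can then be measured with ctdb's ping_pong tool, e.g.
running `ping_pong /mnt/gfs2/test.dat 6` on each node of a 5-node
cluster (the lock count argument is conventionally nodes + 1).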

The rate limiting was added because the IPC system in the original
openais in fc6/rhel5.0 would disconnect heavy users of IPC connections,
triggering a fencing operation on the node.  That problem has been
resolved since 5.3.z (also f11+).

> Setting jumbo frames won't help as the issue is one of latency rather
> than throughput (performance-wise). Using a low-latency interconnect in
> the cluster should help fcntl lock performance though.

Jumbo frames reduce latency AND increase throughput from origination
to delivery for heavy message traffic.  For very light message traffic,
latency is increased but throughput is still improved.

> The locking rate should have no bearing on the filesystem itself. The
> locking (this refers to fcntl locks only, btw) is performed in userspace
> by dlm_controld (gfs_controld on older clusters) and merely passed
> through the filesystem. The fcntl code is identical between gfs1 and gfs2.
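As an illustration, the locks in question are POSIX record locks taken
via fcntl; a minimal sketch, with a scratch file standing in for a path
on the GFS2 mount:

```python
import fcntl, os, tempfile

def try_record_lock(path):
    """Take and release an exclusive POSIX record lock on bytes 0-9.

    On GFS2 this fcntl request is arbitrated cluster-wide in userspace
    by dlm_controld (gfs_controld on older clusters); on a local
    filesystem the kernel handles it directly.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX, 10, 0)   # lock bytes 0..9
        fcntl.lockf(fd, fcntl.LOCK_UN, 10, 0)   # release them
        return True
    finally:
        os.close(fd)

print(try_record_lock(tempfile.mkstemp()[1]))   # prints True
```

This is exactly the style of lock that the ping_pong tool hammers on
when measuring plock rates.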
> > I'm still not completely confident I won't see this happen again on my 
> > production servers. I'm hoping you can help me with that.
> > 
> Yes, so am I :-) It sounds like we are making progress if we can reduce
> the search space for the problem. From your message it sounds very much
> as if you believe it is a recovery issue, which seems plausible to me.
> --
> Linux-cluster mailing list
> Linux-cluster redhat com
> https://www.redhat.com/mailman/listinfo/linux-cluster
