[Linux-cluster] generic error on gfs status

Mon Apr 9 20:20:56 UTC 2007

On Mon, Apr 09, 2007 at 12:39:33PM -0500, David M wrote:
> I am running a four node GFS cluster with about 20 services per node.  All
> four nodes belong to the same failover domain, and they each have a priority
> of 1.  My shared storage is an iSCSI SAN (on a dedicated switch).
> 
> Over the last 24 hours, /gfsdata has logged 90 "notices" in the daemon log
> stating that the gfs "status" check returned a generic error:
> clurgmgrd[6074]: <notice> status on clusterfs "/gfsdata" returned 1 (generic error)
> clurgmgrd[6074]: <notice> Stopping service service1_03
> clurgmgrd[6074]: <notice> Service service1_03 is recovering
> clurgmgrd[6074]: <notice> Recovering failed service service1_03
> clurgmgrd[6074]: <notice> Service service1_03 started
> 
> So, every time /gfsdata returns a generic error, rgmanager restarts a
> service.
> 
> Can anyone shed any light on why I might be losing my mount point or why gfs
> is returning a 1?

> dlm_lkb           3318298 3318298    232   17    1 : tunables  120   60    8 : slabdata 195194 195194      0

^^ That is definitely 212644; use the aforementioned packages to fix it.

In your case, the status check is most likely failing because of that
same problem, since you're using GFS as part of a service.
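
If you want to keep an eye on that counter across nodes, something along
these lines might help. It is only a rough sketch, not part of any
package: it assumes the slabinfo 2.x line layout (name, active_objs,
num_objs, objsize, ... before the first colon), and the function names
parse_slabinfo_line and find_cache are made up for illustration. Reading
/proc/slabinfo normally requires root.

```python
def parse_slabinfo_line(line):
    """Parse one cache line from /proc/slabinfo (2.x layout assumed).

    Fields before the first ':' are assumed to be:
    name active_objs num_objs objsize objperslab pagesperslab
    """
    head = line.split(":")[0].split()
    name = head[0]
    active_objs, num_objs, objsize = (int(x) for x in head[1:4])
    return {
        "name": name,
        "active_objs": active_objs,
        "num_objs": num_objs,
        "objsize": objsize,
        # Rough memory held by live objects; ignores per-slab overhead.
        "approx_bytes": active_objs * objsize,
    }


def find_cache(name, path="/proc/slabinfo"):
    """Return parsed stats for the named slab cache, or None if absent."""
    with open(path) as f:
        for line in f:
            fields = line.split()
            if fields and fields[0] == name:
                return parse_slabinfo_line(line)
    return None


if __name__ == "__main__":
    # The line quoted above, used as sample input instead of the live file.
    sample = ("dlm_lkb 3318298 3318298 232 17 1 : tunables 120 60 8 "
              ": slabdata 195194 195194 0")
    info = parse_slabinfo_line(sample)
    print(info["name"], info["active_objs"], info["approx_bytes"])
```

On a live node you would call find_cache("dlm_lkb") instead of parsing a
pasted line. For the figures quoted above, active_objs times objsize
works out to roughly 734 MiB held in dlm_lkb objects.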

-- Lon

-- 
Lon Hohberger - Software Engineer - Red Hat, Inc.