[Linux-cluster] GFS = filesystem consistency error

oly cluster at squiz.net
Wed Mar 1 03:02:11 UTC 2006


Hi there,
        I've got a 4-node RHEL4 cluster with GFS version 6.1.0 (built
        Jun  7 2005 12:46:04).
        The shared disk is a NAS detected by aoe as /dev/etherd/e0.0.
        I have a problem with a few files on the file system: if I try
        to modify the inode of one of these files (delete the file, or
        unlink the inode), the cluster node where I launch the command
        loses the GFS mount, and the GFS modules stay busy and cannot be
        removed from the kernel. The node is then stuck, and the only
        solution is a hardware restart of that node.
        All the GFS journals seem to work fine; I can even stat the
        DEAD file.
        Does GFS have a problem manipulating files in a folder that
        holds more than 1 million files?
        Does anyone have a solution to remove these dead files, or to
        delete the folder that contains all of them?
        Can gfs_fsck resolve my problem?
        Is there a later version that fixes this problem?
        
        Thanks in advance.
        PS: all the details are below.
         
        The error I get when I try to unlink the file's inode:
        ===========ERROR============
        GFS: fsid=entcluster:sataide.2: fatal: filesystem consistency error
        GFS: fsid=entcluster:sataide.2:   inode = 8516674/8516674
        GFS: fsid=entcluster:sataide.2:   function = gfs_change_nlink
        GFS: fsid=entcluster:sataide.2:   file = /usr/src/build/574067-i686/BUILD/smp/src/gfs/inode.c, line = 843
        GFS: fsid=entcluster:sataide.2:   time = 1141080134
        GFS: fsid=entcluster:sataide.2: about to withdraw from the cluster
        GFS: fsid=entcluster:sataide.2: waiting for outstanding I/O
        GFS: fsid=entcluster:sataide.2: telling LM to withdraw
        lock_dlm: withdraw abandoned memory
        GFS: fsid=entcluster:sataide.2: withdrawn
          mh_magic = 0x01161970
          mh_type = 4
          mh_generation = 68
          mh_format = 400
          mh_incarn = 6
          no_formal_ino = 8516674
          no_addr = 8516674
          di_mode = 0664
          di_uid = 500
          di_gid = 500
          di_nlink = 0
          di_size = 0
          di_blocks = 1
          di_atime = 1141042636
          di_mtime = 1140001370
          di_ctime = 1140001370
          di_major = 0
          di_minor = 0
          di_rgrp = 8513987
          di_goal_rgrp = 8513987
          di_goal_dblk = 2682
          di_goal_mblk = 2682
          di_flags = 0x00000004
          di_payload_format = 0
          di_type = 1
          di_height = 0
          di_incarn = 0
          di_pad = 0
          di_depth = 0
          di_entries = 0
          no_formal_ino = 0
          no_addr = 0
          di_eattr = 0
          di_reserved =
        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        00 00 00 00 00 00 00 00
        ========END OF ERROR==========
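
For anyone reading the dump above, the field that stands out is di_nlink = 0: the inode already has a link count of zero, so an unlink would have to take it below zero. That this is what gfs_change_nlink objects to is an assumption, not something confirmed from the GFS source. A minimal Python sketch for pulling the interesting fields out of such a dump and printing the timestamps in readable form (the helper names parse_dump and to_utc are made up for illustration):

```python
from datetime import datetime, timezone

# Excerpt of the fields printed after the withdraw message above.
DUMP = """
  no_formal_ino = 8516674
  no_addr = 8516674
  di_mode = 0664
  di_nlink = 0
  di_size = 0
  di_blocks = 1
  di_atime = 1141042636
  di_mtime = 1140001370
  di_ctime = 1140001370
"""


def parse_dump(text):
    """Turn 'key = value' lines into a dict of strings (first occurrence wins)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip():
            fields.setdefault(key.strip(), value.strip())
    return fields


def to_utc(epoch):
    """Format a Unix timestamp as an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(int(epoch), tz=timezone.utc).isoformat()


fields = parse_dump(DUMP)
nlink = int(fields["di_nlink"])
print("inode", fields["no_addr"], "di_nlink =", nlink)
if nlink == 0:
    # Assumption: a zero link count is what makes the unlink fail.
    print("link count is already zero; an unlink would take it below zero")
for name in ("di_atime", "di_mtime", "di_ctime"):
    print(name, to_utc(fields[name]))
```

The same to_utc helper can be applied to the "time = 1141080134" line of the log to see when the withdraw happened.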
        
        My cman status:
        ==========STATUS============
        Protocol version: 5.0.1
        Config version: 4
        Cluster name: entcluster
        Cluster ID: 42548
        Cluster Member: Yes
        Membership state: Cluster-Member
        Nodes: 4
        Expected_votes: 1
        Total_votes: 4
        Quorum: 3
        Active subsystems: 5
        Node name: XXX.domainX.tld
        Node addresses: x.x.x.x
        ========END CMAN=========
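
As a sanity check on the status above, "Quorum: 3" is what a simple majority of "Total_votes: 4" gives. The sketch below assumes cman uses the usual majority rule (half the votes, rounded down, plus one); that has not been checked against the cman source, and the function name majority_quorum is made up for illustration.

```python
def majority_quorum(total_votes):
    """Smallest number of votes that is a strict majority of total_votes."""
    return total_votes // 2 + 1


total_votes = 4  # "Total_votes" from the cman status above
quorum = majority_quorum(total_votes)
print("quorum for", total_votes, "votes:", quorum)
print("votes that can be lost before quorum is lost:", total_votes - quorum)
```

With one vote per node, this means the cluster should stay quorate while a single stuck node is being restarted.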
        
        My gfs_tool df :
        ============DF=========
        /home:
          SB lock proto = "lock_dlm"
          SB lock table = "entcluster:sataide"
          SB ondisk format = 1309
          SB multihost format = 1401
          Block size = 4096
          Journals = 4
          Resource Groups = 274
          Mounted lock proto = "lock_dlm"
          Mounted lock table = "entcluster:sataide"
          Mounted host data = ""
          Journal number = 0
          Lock module flags =
          Local flocks = FALSE
          Local caching = FALSE
          Oopses OK = FALSE
        
          Type           Total          Used           Free           use%
          ------------------------------------------------------------------------
          inodes         100642         100642         0              100%
          metadata       3842538        8527           3834011        0%
          data           13999476       2760327        11239149       20%
        =============END DF =========
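
The counts in the df output are in file system blocks, not bytes. A short Python sketch that checks the columns add up and converts them using the reported block size; rounding use% to the nearest integer is an assumption about how gfs_tool prints it, and the names USAGE and summarize are made up for illustration.

```python
BLOCK_SIZE = 4096  # "Block size" from the output above

# (total, used, free) in blocks, copied from the table above
USAGE = {
    "inodes": (100642, 100642, 0),
    "metadata": (3842538, 8527, 3834011),
    "data": (13999476, 2760327, 11239149),
}


def summarize(total, used, free):
    """Return (use percent rounded to an integer, total size in GiB)."""
    if total != used + free:
        raise ValueError("used + free does not equal total")
    percent = round(100 * used / total) if total else 0
    gib = total * BLOCK_SIZE / 2**30
    return percent, gib


for kind, row in USAGE.items():
    percent, gib = summarize(*row)
    print(f"{kind:<10} {percent:>3}%  {gib:8.2f} GiB total")
```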
        Versions of my modules:
        ========modules========
        CMAN 2.6.9-36.0 (built May 31 2005 12:15:02) installed
        DLM 2.6.9-34.0 (built Jun  2 2005 15:17:56) installed
        Lock_Harness 2.6.9-35.5 (built Jun  7 2005 12:42:30) installed
        GFS 2.6.9-35.5 (built Jun  7 2005 12:42:49) installed
        aoe: aoe_init: AoE v2.6-11 initialised.
        Lock_DLM (built Jun  7 2005 12:42:32) installed
        ========end modules========
        
        
        
        -- 
        Aurelien Lemaire (oly)
        http://www.squiz.net
        Sydney | Canberra | London
        92 Jarrett St Leichhardt, Sydney, NSW 2040
        T:+61 2 9568 6866 
        F:+61 2 9568 6733    



