
[Linux-cluster] GFS file system corruption?



I have six computers running Red Hat Enterprise Linux 3 release 5 with kernel 2.4.21-32.0.1.ELsmp.

I compiled GFS 6.0.2.20-2 from the source code.  The SAN is an iSCSI-based storage system from LeftHand Networks.  On ext3, the PostMark disk test runs cleanly; on a GFS file system, it reports a number of errors.  The output from both PostMark runs is below.

I unmounted the file systems and ran gfs_fsck on the GFS file system.  It produced a number of errors like these:
[root@imagine root]# gfs_fsck -y /dev/pool/u_as
Initializing fsck
Starting pass1
Pass1 complete
Starting pass1b
Pass1b complete
Starting pass1c
Pass1c complete
Starting pass2
Pass2 complete
Starting pass3
Pass3 complete
Starting pass4
Pass4 complete
Starting pass5
ondisk and fsck bitmaps differ at block 17887
Succeeded.
ondisk and fsck bitmaps differ at block 17888
Succeeded.
ondisk and fsck bitmaps differ at block 17889
Succeeded.
ondisk and fsck bitmaps differ at block 17890
Succeeded.
ondisk and fsck bitmaps differ at block 17891
Succeeded.
ondisk and fsck bitmaps differ at block 17892
Succeeded.
ondisk and fsck bitmaps differ at block 17893
Succeeded.
ondisk and fsck bitmaps differ at block 17894
Succeeded.
ondisk and fsck bitmaps differ at block 17895
Succeeded.
ondisk and fsck bitmaps differ at block 17896
Succeeded.
ondisk and fsck bitmaps differ at block 17897
Succeeded.
ondisk and fsck bitmaps differ at block 17898
Succeeded.
ondisk and fsck bitmaps differ at block 17899
Succeeded.
ondisk and fsck bitmaps differ at block 17900
Succeeded.
ondisk and fsck bitmaps differ at block 17901
Succeeded.

The complete output from gfs_fsck was 935k.
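
A log of that size is easier to compare between runs once the per-block messages are collapsed into contiguous ranges. The following is a minimal sketch, assuming every relevant line has exactly the form shown in the excerpt above; the function name `collapse_blocks` is hypothetical and not part of GFS.

```python
import re

# Message format taken from the gfs_fsck excerpt above.
PATTERN = re.compile(r"ondisk and fsck bitmaps differ at block (\d+)")


def collapse_blocks(lines):
    """Return sorted (first, last) ranges of consecutive block numbers."""
    blocks = set()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            blocks.add(int(match.group(1)))

    ranges = []
    for block in sorted(blocks):
        if ranges and block == ranges[-1][1] + 1:
            ranges[-1][1] = block          # extend the current run
        else:
            ranges.append([block, block])  # start a new run
    return [tuple(r) for r in ranges]
```

Feeding it the saved gfs_fsck output (for example `collapse_blocks(open("fsck.log"))`, where `fsck.log` is an assumed file name) gives one tuple per run of adjacent blocks, which shows quickly whether the damage is one region or many.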

I have included the cluster configuration files below.  Fencing is handled by a Perl script that I wrote; it uses SNMP to turn off the ports on a Cisco 3750 switch.  There were no log entries from GULM or GFS on any of the hosts.
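
For context on this style of fencing: disabling a switch port over SNMP usually means setting that interface's ifAdminStatus (IF-MIB, OID 1.3.6.1.2.1.2.2.1.7) to down(2). The sketch below is not the Perl script mentioned above, which is not shown in this message. It only builds a Net-SNMP `snmpset` command line, and the switch name, community string, and interface index are all hypothetical placeholders.

```python
# IF-MIB ifAdminStatus column; the interface index is appended as the last arc.
IF_ADMIN_STATUS = "1.3.6.1.2.1.2.2.1.7"
ADMIN_DOWN = "2"  # ifAdminStatus values: up(1), down(2), testing(3)


def port_shutdown_command(switch, community, if_index):
    """Build the argv for an snmpset call that administratively downs a port.

    All three arguments are assumptions for illustration; a real fence
    agent would look them up from the node's fence configuration.
    """
    if int(if_index) <= 0:
        raise ValueError("interface index must be a positive integer")
    oid = f"{IF_ADMIN_STATUS}.{int(if_index)}"
    return ["snmpset", "-v2c", "-c", community, switch, oid, "i", ADMIN_DOWN]
```

The returned list can be passed to `subprocess.run(..., check=True)`. Checking the result matters here because a fence operation that silently fails leaves the fenced node able to keep writing to shared storage.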

PostMark is available from http://www.netapp.com/tech_library/3022.html.

Does anybody have any idea what would cause this?

Other GFS file systems on these servers have had similar problems.  It seems that after some number of file creates and deletes, gfs_fsck finds bitmap errors to repair.  PostMark is the only program that I have used on GFS that reports errors.  The ext3 fsck was clean after PostMark was run.
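
Since the trigger appears to be plain create and delete traffic, a much smaller reproducer than PostMark may help narrow it down. The following is a rough sketch, not a replacement for PostMark: the function name `churn`, the file naming, and the default sizes are all assumptions. Unlike the PostMark output below, it records the errno of each failure.

```python
import os


def churn(directory, rounds=100, size=4096):
    """Create, append to, and delete files in `directory`.

    Returns a list of (path, errno, message) tuples, one per failed round.
    """
    failures = []
    payload = b"\0" * size
    for i in range(rounds):
        path = os.path.join(directory, f"churn_{i}")
        try:
            with open(path, "wb") as handle:   # create and write
                handle.write(payload)
            with open(path, "ab") as handle:   # reopen for append
                handle.write(payload)
            os.remove(path)                    # delete
        except OSError as exc:
            failures.append((path, exc.errno, exc.strerror))
    return failures
```

Running `churn("/path/to/gfs/mount", rounds=1000)` on the GFS mount point (the path is a placeholder) and then on ext3 would show whether the failures need PostMark's large files or appear with small ones too.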

Thank you for your assistance.

Matt Brookover
Academic Computing and Networking
Colorado School of Mines
303-273-3436
mbrookov mines edu

PostMark output when running on the ext3 file system:
PostMark v1.5 : 3/27/01
pm>set size 10000 100000000
pm>run
Creating files...Done
Performing transactions..........Done
Deleting files...Done
Time:
        1941 seconds total
        1350 seconds of transactions (0 per second)

Files:
        771 created (0 per second)
                Creation alone: 500 files (0 per second)
                Mixed with transactions: 271 files (0 per second)
        247 read (0 per second)
        253 appended (0 per second)
        771 deleted (0 per second)
                Deletion alone: 542 files (27 per second)
                Mixed with transactions: 229 files (0 per second)

Data:
        13100.09 megabytes read (6.75 megabytes per second)
        42926.13 megabytes written (22.12 megabytes per second)
pm>exit

PostMark output when running on the GFS file system:
PostMark v1.5 : 3/27/01
pm>set size 10000 100000000
pm>run
Creating files...Done
Performing transactions....Error: cannot open '615' for writing
Error: cannot open '616' for writing
Error: cannot open '617' for writing
Error: cannot open '619' for writing
Error: cannot open '623' for writing
.Error: cannot open '624' for writing
Error: cannot open '625' for writing
Error: cannot open '626' for writing
Error: cannot open '629' for writing
Error: cannot open '633' for writing
Error: cannot open '634' for writing
Error: cannot open '635' for writing
Error: cannot open '636' for writing
Error: cannot open '637' for writing
Error: cannot open '641' for writing
Error: cannot open '642' for writing
Error: cannot open '643' for writing
Error: cannot open '644' for writing
Error: Cannot delete '637'
Error: Cannot delete '615'
Error: Cannot delete '634'
.Error: cannot open '650' for writing
Error: Cannot delete '625'
Error: cannot open '667' for writing
Error: cannot open '668' for writing
Error: cannot open '669' for writing
.Error: cannot open '687' for writing
.Error: cannot open '696' for writing
Error: cannot open '709' for writing
Error: cannot open '712' for writing
Error: cannot open '719' for writing
Error: cannot open '720' for writing
.Error: cannot open '721' for writing
Error: cannot open '642' for reading
Error: cannot open '722' for writing
Error: cannot open '626' for append
Error: cannot open '642' for append
Error: Cannot delete '624'
Error: cannot open '731' for writing
Error: cannot open '735' for writing
Error: cannot open '736' for writing
Error: cannot open '720' for append
Error: cannot open '737' for writing
Error: cannot open '741' for writing
Error: cannot open '742' for writing
Error: cannot open '743' for writing
Error: cannot open '744' for writing
Error: cannot open '746' for writing
Error: cannot open '748' for writing
Error: Cannot delete '743'
.Error: Cannot delete '721'
Error: cannot open '741' for reading
Error: Cannot delete '719'
Error: cannot open '755' for writing
Error: cannot open '756' for writing
Error: cannot open '641' for reading
Error: cannot open '760' for writing
Error: cannot open '743' for reading
Error: Cannot delete '636'
Error: Cannot delete '669'
Error: cannot open '687' for reading
Done
Deleting files...Error: Cannot delete '615'
Error: Cannot delete '617'
Error: Cannot delete '619'
Error: Cannot delete '623'
Error: Cannot delete '624'
Error: Cannot delete '625'
Error: Cannot delete '626'
Error: Cannot delete '629'
Error: Cannot delete '633'
Error: Cannot delete '634'
Error: Cannot delete '635'
Error: Cannot delete '636'
Error: Cannot delete '637'
Error: Cannot delete '641'
Error: Cannot delete '642'
Error: Cannot delete '643'
Error: Cannot delete '644'
Error: Cannot delete '667'
Error: Cannot delete '668'
Error: Cannot delete '669'
Error: Cannot delete '687'
Error: Cannot delete '696'
Error: Cannot delete '709'
Error: Cannot delete '712'
Error: Cannot delete '719'
Error: Cannot delete '720'
Error: Cannot delete '721'
Error: Cannot delete '722'
Error: Cannot delete '731'
Error: Cannot delete '735'
Error: Cannot delete '736'
Error: Cannot delete '737'
Error: Cannot delete '741'
Error: Cannot delete '742'
Error: Cannot delete '743'
Error: Cannot delete '744'
Error: Cannot delete '746'
Error: Cannot delete '748'
Error: Cannot delete '755'
Error: Cannot delete '756'
Error: Cannot delete '760'
Done
Time:
        1773 seconds total
        1086 seconds of transactions (0 per second)

Files:
        768 created (0 per second)
                Creation alone: 500 files (0 per second)
                Mixed with transactions: 268 files (0 per second)
        239 read (0 per second)
        253 appended (0 per second)
        768 deleted (0 per second)
                Deletion alone: 505 files (16 per second)
                Mixed with transactions: 222 files (0 per second)

Data:
        12812.81 megabytes read (7.23 megabytes per second)
        40532.43 megabytes written (22.86 megabytes per second)
pm>exit
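
The error lines in the GFS run above can be tallied by operation to see which kinds of call fail most often. This is a small sketch that assumes the exact message wording shown in that output; `tally_errors` is a hypothetical helper, not part of PostMark.

```python
import re
from collections import Counter

# Matches e.g. "Error: cannot open '615' for writing" and
# "Error: Cannot delete '637'", even when progress dots precede "Error".
ERROR = re.compile(
    r"Error: cannot (open|delete) '(\d+)'(?: for (\w+))?", re.IGNORECASE
)


def tally_errors(lines):
    """Count PostMark error lines, keyed by the failing operation."""
    counts = Counter()
    for line in lines:
        match = ERROR.search(line)
        if not match:
            continue
        operation = match.group(1).lower()
        mode = match.group(3)
        counts[operation if mode is None else f"{operation} for {mode}"] += 1
    return counts
```

Calling `tally_errors(open("postmark_gfs.log"))` on a saved copy of the run (the file name is an assumption) returns a `Counter` such as one with keys `"open for writing"`, `"open for reading"`, `"open for append"`, and `"delete"`.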

cluster.ccs file:
cluster
{
        name = "CSM_ACN"
        lock_gulm
        {
                servers = ["imagine.Mines.EDU","illuminate.Mines.EDU","illusion.Mines.EDU"]
                heartbeat_rate = 3.0
                allowed_misses = 5
        }
}

fence.ccs file:
fence_devices
{
        CSMACN_fence
        {
                agent = "fence_cisco"
        }
}

nodes.ccs file:
nodes
{
        imagine.Mines.EDU
        {
                ip_interfaces
                {
                        eth0 = "138.67.130.1"
                }
                fence
                {
                        snmpfence
                        {
                                CSMACN_fence
                                {
                                        port="imagine"
                                }
                        }
                }
        }

        illuminate.Mines.EDU
        {
                ip_interfaces
                {
                        eth0 = "138.67.130.2"
                }
                fence
                {
                        snmpfence
                        {
                                CSMACN_fence
                                {
                                        port="illuminate"
                                }
                        }
                }
        }

        illusion.Mines.EDU
        {
                ip_interfaces
                {
                        eth0 = "138.67.130.3"
                }
                fence
                {
                        snmpfence
                        {
                                CSMACN_fence
                                {
                                        port="illusion"
                                }
                        }
                }
        }

        inspire.Mines.EDU
        {
                ip_interfaces
                {
                        eth0 = "138.67.130.5"
                }
                fence
                {
                        snmpfence
                        {
                                CSMACN_fence
                                {
                                        port="inspire"
                                }
                        }
                }
        }
        inception.Mines.EDU
        {
                ip_interfaces
                {
                        eth0 = "138.67.130.4"
                }
                fence
                {
                        snmpfence
                        {
                                CSMACN_fence
                                {
                                        port="inception"
                                }
                        }
                }
        }
        incantation.Mines.EDU
        {
                ip_interfaces
                {
                        eth0 = "138.67.130.6"
                }
                fence
                {
                        snmpfence
                        {
                                CSMACN_fence
                                {
                                        port="incantation"
                                }
                        }
                }
        }
}

