
Re: [Linux-cluster] Continuing gfs2 problems: Am I doing something wrong????



 I'm having an identical problem.

I have 2 nodes running a Wordpress instance with a TCP load balancer in front of them distributing http requests between them.

In the last 2 days, I've had 10+ instances where the GFS2 volume hangs with:

Sep 16 14:05:10 wordpress3 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 16 14:05:10 wordpress3 kernel: delete_workqu D 00000272 2676 3687 19 3688 3686 (L-TLB)
Sep 16 14:05:10 wordpress3 kernel: f7839e38 00000046 3f1c322e 00000272 00000000 f57ab400 f7839df8 0000000a
Sep 16 14:05:10 wordpress3 kernel: c3217aa0 3f1dcca8 00000272 00019a7a 00000001 c3217bac c3019744 f57c5ac0
Sep 16 14:05:10 wordpress3 kernel: f8afa21c 00000003 f26162f0 00000000 f2213df8 00000018 c3019c00 f7839e6c
Sep 16 14:05:10 wordpress3 kernel: Call Trace:
Sep 16 14:05:10 wordpress3 kernel:  [<f8afa21c>] gdlm_bast+0x0/0x78 [lock_dlm]
Sep 16 14:05:10 wordpress3 kernel:  [<f8c3910e>] just_schedule+0x5/0x8 [gfs2]
Sep 16 14:05:10 wordpress3 kernel:  [<c061d2f5>] __wait_on_bit+0x33/0x58
Sep 16 14:05:10 wordpress3 kernel:  [<f8c39109>] just_schedule+0x0/0x8 [gfs2]
Sep 16 14:05:10 wordpress3 kernel:  [<f8c39109>] just_schedule+0x0/0x8 [gfs2]
Sep 16 14:05:10 wordpress3 kernel:  [<c061d37c>] out_of_line_wait_on_bit+0x62/0x6a
Sep 16 14:05:10 wordpress3 kernel:  [<c0436098>] wake_bit_function+0x0/0x3c
Sep 16 14:05:10 wordpress3 kernel:  [<f8c39102>] gfs2_glock_wait+0x27/0x2e [gfs2]
Sep 16 14:05:10 wordpress3 kernel:  [<f8c4c667>] gfs2_check_blk_type+0xbc/0x18c [gfs2]
Sep 16 14:05:10 wordpress3 kernel:  [<c061d312>] __wait_on_bit+0x50/0x58
Sep 16 14:05:10 wordpress3 kernel:  [<f8c39109>] just_schedule+0x0/0x8 [gfs2]
Sep 16 14:05:10 wordpress3 kernel:  [<f8c4c660>] gfs2_check_blk_type+0xb5/0x18c [gfs2]
Sep 16 14:05:10 wordpress3 kernel:  [<f8c4c3c8>] gfs2_rindex_hold+0x2b/0x148 [gfs2]
Sep 16 14:05:10 wordpress3 kernel:  [<f8c48273>] gfs2_delete_inode+0x6f/0x1a1 [gfs2]
Sep 16 14:05:10 wordpress3 kernel:  [<f8c4823b>] gfs2_delete_inode+0x37/0x1a1 [gfs2]
Sep 16 14:05:10 wordpress3 kernel:  [<f8c48204>] gfs2_delete_inode+0x0/0x1a1 [gfs2]
Sep 16 14:05:10 wordpress3 kernel:  [<c048cb02>] generic_delete_inode+0xa5/0x10f
Sep 16 14:05:10 wordpress3 kernel:  [<c048c5a6>] iput+0x64/0x66
Sep 16 14:05:10 wordpress3 kernel:  [<f8c3a8bb>] delete_work_func+0x49/0x53 [gfs2]
Sep 16 14:05:10 wordpress3 kernel:  [<c04332da>] run_workqueue+0x78/0xb5
Sep 16 14:05:10 wordpress3 kernel:  [<f8c3a872>] delete_work_func+0x0/0x53 [gfs2]
Sep 16 14:05:10 wordpress3 kernel:  [<c0433b8e>] worker_thread+0xd9/0x10b
Sep 16 14:05:10 wordpress3 kernel:  [<c041f81b>] default_wake_function+0x0/0xc
Sep 16 14:05:10 wordpress3 kernel:  [<c0433ab5>] worker_thread+0x0/0x10b
Sep 16 14:05:10 wordpress3 kernel:  [<c0435fa7>] kthread+0xc0/0xed
Sep 16 14:05:10 wordpress3 kernel:  [<c0435ee7>] kthread+0x0/0xed
Sep 16 14:05:10 wordpress3 kernel:  [<c0405c53>] kernel_thread_helper+0x7/0x10

And then a bunch more for the httpd processes. I can reproduce this pretty consistently by untarring a large tarball on the volume; anything I/O-intensive seems to trigger it.
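For reference, the reproducer is nothing exotic; here's a sketch of what I'm doing. The scratch path is made up (it defaults to a local temp dir so the script is safe to dry-run anywhere); point SCRATCH at any directory on the GFS2 mount to actually trigger the hang:

```shell
#!/bin/sh
# Untar a tree of small files on the shared volume.  SCRATCH defaults
# to a local temp dir for a safe dry-run; set it to a directory on the
# GFS2 mount (an invented example: SCRATCH=/var/www/scratch) to hit
# the real filesystem.
set -e
SCRATCH=${SCRATCH:-$(mktemp -d)}

# Lots of small files means lots of inode creates and glock traffic.
mkdir -p "$SCRATCH/src"
i=0
while [ "$i" -lt 1000 ]; do
    echo "data $i" > "$SCRATCH/src/file$i"
    i=$((i + 1))
done

tar -C "$SCRATCH" -cf "$SCRATCH/src.tar" src
mkdir -p "$SCRATCH/dst"
tar -C "$SCRATCH/dst" -xf "$SCRATCH/src.tar"   # the extract is the step that hangs
```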

Running CentOS 5.5 with kernel 2.6.18-194.11.1.el5 #1 SMP Tue Aug 10 19:09:06 EDT 2010 i686 i686 i386 GNU/Linux

I tried the hangalizer program and it always came back with:
hb.medianewsgroup.com "/bin/ls /gfs2/"
/bin/ls: /gfs2/: No such file or directory
No waiting glocks found on any node.
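Since hangalizer's ls check choked on the path, I also tried checking for waiters by hand. A sketch of one way to do that: in a GFS2 debugfs glock dump, holder ("H:") lines whose f: flags contain W are still waiting. The two-glock dump below is fabricated for illustration (and uses the newer debugfs layout, which may differ on this kernel); on a live node you'd feed the filter /sys/kernel/debug/gfs2/<cluster>:<fs>/glocks instead:

```shell
#!/bin/sh
# Filter a GFS2 glock dump for waiting holders.  On a real node:
#   mount -t debugfs none /sys/kernel/debug
#   waiters < /sys/kernel/debug/gfs2/<cluster>:<fs>/glocks
waiters() {
    # "H:" lines are holders; a W in the f: flags field means waiting.
    awk '/^ *H:/ && / f:[A-Za-z]*W/'
}

# Demo on a fabricated dump (addresses, pids, and names invented):
waiters <<'EOF'
G:  s:EX n:2/805b1 f:lq t:SH d:EX/0 a:0 r:4
 H: s:SH f:W e:0 p:25716 [imap] gfs2_permission+0x83/0xd5
G:  s:SH n:2/12 f: t:SH d:EX/0 a:0 r:3
 H: s:SH f:H e:0 p:9217 [httpd] gfs2_getattr+0x44/0x90
EOF
```

Only the imap holder (the one flagged W) should come back out.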

Any ideas?

On 08/03/2010 01:38 PM, Scooter Morris wrote:
Hi all,
We continue to have gfs2 crashes and hangs on our production cluster, so I'm beginning to think that we've done something really wrong. Here is our set-up:

    * 4 node cluster, only 3 participate in gfs2 filesystems
    * Running several services on multiple nodes using gfs2:
          o IMAP (dovecot)
          o Web (apache with lots of python)
          o Samba (using ctdb)
    * GFS2 partitions are multipathed on an HP EVA-based SAN (no LVM)
      -- here is fstab from one node (the three nodes are all the same):

        LABEL=/1                     /                      ext3    defaults        1 1
        LABEL=/boot1                 /boot                  ext3    defaults        1 2
        tmpfs                        /dev/shm               tmpfs   defaults        0 0
        devpts                       /dev/pts               devpts  gid=5,mode=620  0 0
        sysfs                        /sys                   sysfs   defaults        0 0
        proc                         /proc                  proc    defaults        0 0
        LABEL=SW-cciss/c0d0p2        swap                   swap    defaults        0 0
        LABEL=plato:Mail             /var/spool/mail        gfs2    noatime,_netdev
        LABEL=plato:VarTmp           /var/tmp               gfs2    _netdev
        LABEL=plato:UsrLocal         /usr/local             gfs2    noatime,_netdev
        LABEL=plato:UsrLocalProjects /usr/local/projects    gfs2    noatime,_netdev
        LABEL=plato:Home2            /home/socr             gfs2    noatime,_netdev
        LABEL=plato:HomeNoBackup     /home/socr/nobackup    gfs2    _netdev
        LABEL=plato:DbBackup         /databases/backups     gfs2    noatime,_netdev
        LABEL=plato:DbMol            /databases/mol         gfs2    noatime,_netdev
        LABEL=plato:MolDbBlast       /databases/mol/blast   gfs2    noatime,_netdev
        LABEL=plato:MolDbEmboss      /databases/mol/emboss  gfs2    noatime,_netdev

    * Kernel version is: 2.6.18-194.3.1.el5 and all nodes are x86_64.
    * Every so often, we start seeing gfs2-related task hangs in the
      logs.  In the most recent instance (last Friday) we got this:

        Node 0:

            [2010-07-30 13:23:25]INFO: task imap:25716 blocked for more than 120 seconds.
            [2010-07-30 13:23:25]"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
            [2010-07-30 13:23:25]imap D ffff8100010825a0 0 25716 9217 24080 25667 (NOTLB)
            [2010-07-30 13:23:25] ffff810619b59bc8 0000000000000086 ffff810113233f10 ffffffff00000000
            [2010-07-30 13:23:26] ffff81000f8c5cd0 000000000000000a ffff810233416040 ffff81082fd05100
            [2010-07-30 13:23:26] 00012196d153c88e 0000000000008b81 ffff810233416228 0000000f6a949180
            [2010-07-30 13:23:26]Call Trace:
            [2010-07-30 13:23:26] [<ffffffff887d0be6>] :gfs2:gfs2_dirent_find+0x0/0x4e
            [2010-07-30 13:23:26] [<ffffffff887d0c18>] :gfs2:gfs2_dirent_find+0x32/0x4e
            [2010-07-30 13:23:26] [<ffffffff887d5ee7>] :gfs2:just_schedule+0x0/0xe
            [2010-07-30 13:23:26] [<ffffffff887d5ef0>] :gfs2:just_schedule+0x9/0xe
            [2010-07-30 13:23:26] [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
            [2010-07-30 13:23:26] [<ffffffff887d5ee7>] :gfs2:just_schedule+0x0/0xe
            [2010-07-30 13:23:26] [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
            [2010-07-30 13:23:26] [<ffffffff800a0aec>] wake_bit_function+0x0/0x23
            [2010-07-30 13:23:26] [<ffffffff887d5ee2>] :gfs2:gfs2_glock_wait+0x2b/0x30
            [2010-07-30 13:23:26] [<ffffffff887e579e>] :gfs2:gfs2_permission+0x83/0xd5
            [2010-07-30 13:23:26] [<ffffffff887e5796>] :gfs2:gfs2_permission+0x7b/0xd5
            [2010-07-30 13:23:26] [<ffffffff8000ce97>] do_lookup+0x65/0x1e6
            [2010-07-30 13:23:26] [<ffffffff8000d918>] permission+0x81/0xc8
            [2010-07-30 13:23:26] [<ffffffff8000997f>] __link_path_walk+0x173/0xf42
            [2010-07-30 13:23:26] [<ffffffff8000e9e2>] link_path_walk+0x42/0xb2
            [2010-07-30 13:23:26] [<ffffffff8000ccb2>] do_path_lookup+0x275/0x2f1
            [2010-07-30 13:23:26] [<ffffffff8001280e>] getname+0x15b/0x1c2
            [2010-07-30 13:23:27] [<ffffffff80023876>] __user_walk_fd+0x37/0x4c
            [2010-07-30 13:23:27] [<ffffffff80028846>] vfs_stat_fd+0x1b/0x4a
            [2010-07-30 13:23:27] [<ffffffff800638b3>] schedule_timeout+0x92/0xad
            [2010-07-30 13:23:27] [<ffffffff80097dab>] process_timeout+0x0/0x5
            [2010-07-30 13:23:27] [<ffffffff800f8435>] sys_epoll_wait+0x3b8/0x3f9
            [2010-07-30 13:23:27] [<ffffffff800235a8>] sys_newstat+0x19/0x31
            [2010-07-30 13:23:27] [<ffffffff8005d229>] tracesys+0x71/0xe0
            [2010-07-30 13:23:27] [<ffffffff8005d28d>] tracesys+0xd5/0xe0

        Node 1:

            [2010-07-30 13:23:59]INFO: task pdflush:623 blocked for more than 120 seconds.
            [2010-07-30 13:23:59]"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
            [2010-07-30 13:23:59]pdflush D ffff810407069aa0 0 623 291 624 622 (L-TLB)
            [2010-07-30 13:23:59] ffff8106073c1bd0 0000000000000046 0000000000000001 ffff8103fea899a8
            [2010-07-30 13:23:59] ffff8106073c1c30 000000000000000a ffff8105fff7c0c0 ffff8107fff4c820
            [2010-07-30 13:24:00] 0000ed85d9d7a027 0000000000011b50 ffff8105fff7c2a8 00000006f0a9d0d0
            [2010-07-30 13:24:00]Call Trace:
            [2010-07-30 13:24:00] [<ffffffff8001a927>] submit_bh+0x10a/0x111
            [2010-07-30 13:24:00] [<ffffffff88802ee7>] :gfs2:just_schedule+0x0/0xe
            [2010-07-30 13:24:00] [<ffffffff88802ef0>] :gfs2:just_schedule+0x9/0xe
            [2010-07-30 13:24:00] [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
            [2010-07-30 13:24:00] [<ffffffff88802ee7>] :gfs2:just_schedule+0x0/0xe
            [2010-07-30 13:24:00] [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
            [2010-07-30 13:24:00] [<ffffffff800a0aec>] wake_bit_function+0x0/0x23
            [2010-07-30 13:24:00] [<ffffffff88802ee2>] :gfs2:gfs2_glock_wait+0x2b/0x30
            [2010-07-30 13:24:00] [<ffffffff88813269>] :gfs2:gfs2_write_inode+0x5f/0x152
            [2010-07-30 13:24:00] [<ffffffff88813261>] :gfs2:gfs2_write_inode+0x57/0x152
            [2010-07-30 13:24:00] [<ffffffff8002fbf8>] __writeback_single_inode+0x1e9/0x328
            [2010-07-30 13:24:00] [<ffffffff80020ec9>] sync_sb_inodes+0x1b5/0x26f
            [2010-07-30 13:24:00] [<ffffffff800a08a6>] keventd_create_kthread+0x0/0xc4
            [2010-07-30 13:24:00] [<ffffffff8005123a>] writeback_inodes+0x82/0xd8
            [2010-07-30 13:24:00] [<ffffffff800c97b5>] wb_kupdate+0xd4/0x14e
            [2010-07-30 13:24:00] [<ffffffff80056879>] pdflush+0x0/0x1fb
            [2010-07-30 13:24:00] [<ffffffff800569ca>] pdflush+0x151/0x1fb
            [2010-07-30 13:24:00] [<ffffffff800c96e1>] wb_kupdate+0x0/0x14e
            [2010-07-30 13:24:01] [<ffffffff80032894>] kthread+0xfe/0x132
            [2010-07-30 13:24:01] [<ffffffff8009d734>] request_module+0x0/0x14d
            [2010-07-30 13:24:01] [<ffffffff8005dfb1>] child_rip+0xa/0x11
            [2010-07-30 13:24:01] [<ffffffff800a08a6>] keventd_create_kthread+0x0/0xc4
            [2010-07-30 13:24:01] [<ffffffff80032796>] kthread+0x0/0x132
            [2010-07-30 13:24:01] [<ffffffff8005dfa7>] child_rip+0x0/0x11

        Node 2:

            [2010-07-30 13:24:46]INFO: task delete_workqueu:7175 blocked for more than 120 seconds.
            [2010-07-30 13:24:46]"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
            [2010-07-30 13:24:46]delete_workqu D ffff81082b5cf860 0 7175 329 7176 7174 (L-TLB)
            [2010-07-30 13:24:46] ffff81081ed6dbf0 0000000000000046 0000000000000018 ffffffff887a84f3
            [2010-07-30 13:24:46] 0000000000000286 000000000000000a ffff81082dd477e0 ffff81082b5cf860
            [2010-07-30 13:24:46] 00012166bf7ec21d 000000000002ed0b ffff81082dd479c8 00000007887a9e5a
            [2010-07-30 13:24:46]Call Trace:
            [2010-07-30 13:24:46] [<ffffffff887a84f3>] :dlm:request_lock+0x93/0xa0
            [2010-07-30 13:24:47] [<ffffffff8884f556>] :lock_dlm:gdlm_ast+0x0/0x311
            [2010-07-30 13:24:47] [<ffffffff8884f2c1>] :lock_dlm:gdlm_bast+0x0/0x8d
            [2010-07-30 13:24:47] [<ffffffff887d3ee7>] :gfs2:just_schedule+0x0/0xe
            [2010-07-30 13:24:47] [<ffffffff887d3ef0>] :gfs2:just_schedule+0x9/0xe
            [2010-07-30 13:24:47] [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
            [2010-07-30 13:24:47] [<ffffffff887d3ee7>] :gfs2:just_schedule+0x0/0xe
            [2010-07-30 13:24:47] [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
            [2010-07-30 13:24:47] [<ffffffff800a0aec>] wake_bit_function+0x0/0x23
            [2010-07-30 13:24:47] [<ffffffff887d3ee2>] :gfs2:gfs2_glock_wait+0x2b/0x30
            [2010-07-30 13:24:47] [<ffffffff887e82cf>] :gfs2:gfs2_check_blk_type+0xd7/0x1c9
            [2010-07-30 13:24:47] [<ffffffff887e82c7>] :gfs2:gfs2_check_blk_type+0xcf/0x1c9
            [2010-07-30 13:24:47] [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
            [2010-07-30 13:24:47] [<ffffffff887e804f>] :gfs2:gfs2_rindex_hold+0x32/0x12b
            [2010-07-30 13:24:47] [<ffffffff887d5a29>] :gfs2:delete_work_func+0x0/0x65
            [2010-07-30 13:24:47] [<ffffffff887d5a29>] :gfs2:delete_work_func+0x0/0x65
            [2010-07-30 13:24:47] [<ffffffff887e3e3a>] :gfs2:gfs2_delete_inode+0x76/0x1b4
            [2010-07-30 13:24:47] [<ffffffff887e3e01>] :gfs2:gfs2_delete_inode+0x3d/0x1b4
            [2010-07-30 13:24:47] [<ffffffff8000d3ba>] dput+0x2c/0x114
            [2010-07-30 13:24:48] [<ffffffff887e3dc4>] :gfs2:gfs2_delete_inode+0x0/0x1b4
            [2010-07-30 13:24:48] [<ffffffff8002f35e>] generic_delete_inode+0xc6/0x143
            [2010-07-30 13:24:48] [<ffffffff887d5a83>] :gfs2:delete_work_func+0x5a/0x65
            [2010-07-30 13:24:48] [<ffffffff8004d8f0>] run_workqueue+0x94/0xe4
            [2010-07-30 13:24:48] [<ffffffff8004a12b>] worker_thread+0x0/0x122
            [2010-07-30 13:24:48] [<ffffffff800a08a6>] keventd_create_kthread+0x0/0xc4
            [2010-07-30 13:24:48] [<ffffffff8004a21b>] worker_thread+0xf0/0x122
            [2010-07-30 13:24:48] [<ffffffff8008d087>] default_wake_function+0x0/0xe
            [2010-07-30 13:24:48] [<ffffffff800a08a6>] keventd_create_kthread+0x0/0xc4
            [2010-07-30 13:24:48] [<ffffffff800a08a6>] keventd_create_kthread+0x0/0xc4
            [2010-07-30 13:24:48] [<ffffffff80032894>] kthread+0xfe/0x132
            [2010-07-30 13:24:48] [<ffffffff8005dfb1>] child_rip+0xa/0x11
            [2010-07-30 13:24:48] [<ffffffff800a08a6>] keventd_create_kthread+0x0/0xc4
            [2010-07-30 13:24:48] [<ffffffff80032796>] kthread+0x0/0x132
            [2010-07-30 13:24:48] [<ffffffff8005dfa7>] child_rip+0x0/0x11

    * Various messages related to hung_task_timeouts repeated on each
      node (usually related to imap).
    * Within a minute or two, the cluster was completely hung.  Root
      could log into the console, but commands (like dmesg) would just
      hang.
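
One thing I do notice re-reading the fstab above: two of the gfs2 mounts (VarTmp and HomeNoBackup) are missing noatime. Since atime updates turn even read-only access into writes whose glocks have to bounce between nodes, noatime is generally recommended for GFS2, and we'll probably bring those two lines in line with the rest:

```
LABEL=plato:VarTmp           /var/tmp               gfs2    noatime,_netdev
LABEL=plato:HomeNoBackup     /home/socr/nobackup    gfs2    noatime,_netdev
```

I don't expect that alone to explain full-cluster hangs, but it should at least reduce lock traffic.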

So, my major question: is there something wrong with our configuration? Have we done something really stupid? The initial response from Red Hat was that we shouldn't run services on multiple nodes that access gfs2, which seems a little confusing, since we would just use ext3 or ext4 if we were going to pin each partition to a single node (or fail it over). Have we missed something somewhere?

Thanks in advance for any help anyone can give. We're getting pretty desperate here since the downtime is starting to have a significant impact on our credibility.

-- scooter



--
Linux-cluster mailing list
Linux-cluster redhat com
https://www.redhat.com/mailman/listinfo/linux-cluster

--
Jeff Howell
Sr. Linux Administrator
Media News Group interactive
303.563.6394 jhowell medianewsgroup com

