[Linux-cluster] gfs2_logd eating 99% io, random filesystem freezes
Kveri
kveri at kveri.com
Sun Sep 2 15:02:01 UTC 2012
Hello,
we're running GFS2 on DRBD; the cluster was created in an incomplete state (only one node). When we run dd if=/dev/zero of=/gfs_partition/file, every filesystem on the machine freezes for 10-20 seconds every 1-2 minutes; even ls /etc hangs in the D state. Sometimes the hang lasts more than two minutes and a hung-task message is logged in dmesg.
iotop shows the gfs2_logd and flush-XXX:X kernel threads consuming 99% of the I/O.
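The D-state symptom described above can be confirmed without iotop; a minimal sketch that lists every task currently in uninterruptible sleep (standard procps, no GFS2-specific tooling assumed):

```shell
# Print PID and command name of all tasks in uninterruptible sleep (state D),
# which is the state ls/3531 is stuck in during the freezes.
ps -eo pid,stat,comm --no-headers | awk '$2 ~ /^D/ {print $1, $3}'
```

Running this in a loop during a dd run shows whether the stalls line up with gfs2_logd activity.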
GFS2 is mounted with the rw,noatime,nodiratime,hostdata=jid=0 options.
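If the stalls coincide with periodic journal flushes, one knob worth experimenting with is the GFS2 commit= mount option, which sets the journal flush interval in seconds (the default is 60). The mount point and value below are examples, not a tested recommendation:

```shell
# Flush the journal more often so each flush moves less data at once.
# /gfs_partition and commit=10 are illustrative values; tune for your workload.
mount -o remount,commit=10 /gfs_partition
```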
gettune options:
quota_warn_period = 10
quota_quantum = 60
max_readahead = 262144
complain_secs = 10
statfs_slow = 0
quota_simul_sync = 64
statfs_quantum = 30
quota_scale = 1.0000 (1, 1)
new_files_jdata = 0
The server runs a 64-bit 3.2.0-25 kernel.
Dmesg error (we set the threshold with echo 1 > /proc/sys/kernel/hung_task_timeout_secs, but we also tested with 120 seconds):
[ 818.882147] INFO: task ls:3531 blocked for more than 1 seconds.
[ 818.882479] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 818.882929] ls D ffff8803639364e0 0 3531 3269 0x00000000
[ 818.882932] ffff88033c789c58 0000000000000082 ffff88033c789be8 ffff8801e9c33780
[ 818.882936] ffff88033c789fd8 ffff88033c789fd8 ffff88033c789fd8 0000000000013780
[ 818.882940] ffff8801e5a72e00 ffff8801e5b32e00 0000000000000286 ffff88033c789ce0
[ 818.882943] Call Trace:
[ 818.882950] [<ffffffffa02d2300>] ? gfs2_glock_demote_wait+0x20/0x20 [gfs2]
[ 818.882953] [<ffffffff816579cf>] schedule+0x3f/0x60
[ 818.882959] [<ffffffffa02d230e>] gfs2_glock_holder_wait+0xe/0x20 [gfs2]
[ 818.882963] [<ffffffff8165829f>] __wait_on_bit+0x5f/0x90
[ 818.882965] [<ffffffff816598de>] ? _raw_spin_lock+0xe/0x20
[ 818.882972] [<ffffffffa02d2300>] ? gfs2_glock_demote_wait+0x20/0x20 [gfs2]
[ 818.882975] [<ffffffff8165834c>] out_of_line_wait_on_bit+0x7c/0x90
[ 818.882978] [<ffffffff8108aa90>] ? autoremove_wake_function+0x40/0x40
[ 818.882985] [<ffffffffa02d4467>] gfs2_glock_wait+0x47/0x90 [gfs2]
[ 818.882992] [<ffffffffa02d5d48>] gfs2_glock_nq+0x318/0x440 [gfs2]
[ 818.882998] [<ffffffff81161cff>] ? kmem_cache_free+0x2f/0x110
[ 818.883007] [<ffffffffa02e3ccb>] gfs2_getattr+0xbb/0xf0 [gfs2]
[ 818.883015] [<ffffffffa02e3cc2>] ? gfs2_getattr+0xb2/0xf0 [gfs2]
[ 818.883020] [<ffffffff8117c79e>] vfs_getattr+0x4e/0x80
[ 818.883023] [<ffffffff8117c81e>] vfs_fstatat+0x4e/0x70
[ 818.883026] [<ffffffff8117c85e>] vfs_lstat+0x1e/0x20
[ 818.883029] [<ffffffff8117c9fa>] sys_newlstat+0x1a/0x40
[ 818.883033] [<ffffffff811971cf>] ? mntput+0x1f/0x30
[ 818.883036] [<ffffffff81182652>] ? path_put+0x22/0x30
[ 818.883039] [<ffffffff8119bc1b>] ? sys_lgetxattr+0x5b/0x70
[ 818.883042] [<ffffffff81661ec2>] system_call_fastpath+0x16/0x1b
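When a task is stuck in gfs2_glock_wait like this, GFS2's glock dump in debugfs shows which lock it is blocked on and which holder owns it. A sketch, assuming debugfs is mounted at the usual /sys/kernel/debug location:

```shell
# Dump the first part of the glock table for each mounted GFS2 filesystem.
# The per-filesystem directory name matches the cluster:fsname locktable.
glock_dump() {
    if [ -d /sys/kernel/debug/gfs2 ]; then
        for d in /sys/kernel/debug/gfs2/*/; do
            echo "== ${d} =="
            head -n 40 "${d}glocks"
        done
    else
        echo "debugfs not mounted; try: mount -t debugfs none /sys/kernel/debug"
    fi
}
glock_dump
```

Capturing this output during a freeze, alongside the hung-task trace, usually makes it clear whether gfs2_logd is holding the glock the reader is waiting for.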
What could be the problem?
Thank you.
Martin