[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [Linux-cluster] gfs2 blocking tasks



Well nothing strange there either i think, nothing on for example vm01-test during that period, vm02-test gives me:


Aug 19 00:09:57 vm02-test corosync[7394]: [CMAN ] daemon: Returning command data. length = 1760 Aug 19 00:09:57 vm02-test corosync[7394]: [CMAN ] daemon: sending reply 40000007 to fd 17 Aug 19 00:09:57 vm02-test corosync[7394]: [CMAN ] daemon: read 20 bytes from fd 17 Aug 19 00:09:57 vm02-test corosync[7394]: [CMAN ] daemon: client command is 800000b7 Aug 19 00:09:57 vm02-test corosync[7394]: [CMAN ] daemon: About to process command Aug 19 00:09:57 vm02-test corosync[7394]: [CMAN ] memb: command to process is 800000b7 Aug 19 00:09:57 vm02-test corosync[7394]: [CMAN ] memb: command return code is 0 Aug 19 00:09:57 vm02-test corosync[7394]: [CMAN ] daemon: Returning command data. length = 0 Aug 19 00:09:57 vm02-test corosync[7394]: [CMAN ] daemon: sending reply c00000b7 to fd 17 Aug 19 00:09:57 vm02-test corosync[7394]: [CMAN ] memb: quorum_device_timer_fn Aug 19 00:09:58 vm02-test corosync[7394]: [CMAN ] daemon: read 20 bytes from fd 17 Aug 19 00:09:58 vm02-test corosync[7394]: [CMAN ] daemon: client command is 7 Aug 19 00:09:58 vm02-test corosync[7394]: [CMAN ] daemon: About to process command Aug 19 00:09:58 vm02-test corosync[7394]: [CMAN ] memb: command to process is 7 Aug 19 00:09:58 vm02-test corosync[7394]: [CMAN ] memb: get_all_members: allocated new buffer (retsize=1024) Aug 19 00:09:58 vm02-test corosync[7394]: [CMAN ] memb: get_all_members: retlen = 1760 Aug 19 00:09:58 vm02-test corosync[7394]: [CMAN ] memb: command return code is 4 Aug 19 00:09:58 vm02-test corosync[7394]: [CMAN ] daemon: Returning command data. length = 1760 Aug 19 00:09:58 vm02-test corosync[7394]: [CMAN ] daemon: sending reply 40000007 to fd 17 Aug 19 00:09:58 vm02-test corosync[7394]: [CMAN ] daemon: read 20 bytes from fd 17 Aug 19 00:09:58 vm02-test corosync[7394]: [CMAN ] daemon: client command is 800000b7 Aug 19 00:09:58 vm02-test corosync[7394]: [CMAN ] daemon: About to process command Aug 19 00:09:58 vm02-test corosync[7394]: [CMAN ] memb: command to process is 800000b7 Aug 19 00:09:58 vm02-test corosync[7394]: [CMAN ] memb: command return code is 0 Aug 19 00:09:58 vm02-test corosync[7394]: [CMAN ] daemon: Returning command data. length = 0 Aug 19 00:09:58 vm02-test corosync[7394]: [CMAN ] daemon: sending reply c00000b7 to fd 17 Aug 19 00:09:59 vm02-test corosync[7394]: [CMAN ] daemon: read 20 bytes from fd 17 Aug 19 00:09:59 vm02-test corosync[7394]: [CMAN ] daemon: client command is 7 Aug 19 00:09:59 vm02-test corosync[7394]: [CMAN ] daemon: About to process command Aug 19 00:09:59 vm02-test corosync[7394]: [CMAN ] memb: command to process is 7 Aug 19 00:09:59 vm02-test corosync[7394]: [CMAN ] memb: get_all_members: allocated new buffer (retsize=1024) Aug 19 00:09:59 vm02-test corosync[7394]: [CMAN ] memb: get_all_members: retlen = 1760 Aug 19 00:09:59 vm02-test corosync[7394]: [CMAN ] memb: command return code is 4 Aug 19 00:09:59 vm02-test corosync[7394]: [CMAN ] daemon: Returning command data. length = 1760 Aug 19 00:09:59 vm02-test corosync[7394]: [CMAN ] daemon: sending reply 40000007 to fd 17 Aug 19 00:09:59 vm02-test corosync[7394]: [CMAN ] daemon: read 20 bytes from fd 17 Aug 19 00:09:59 vm02-test corosync[7394]: [CMAN ] daemon: client command is 800000b7 Aug 19 00:09:59 vm02-test corosync[7394]: [CMAN ] daemon: About to process command Aug 19 00:09:59 vm02-test corosync[7394]: [CMAN ] memb: command to process is 800000b7 Aug 19 00:09:59 vm02-test corosync[7394]: [CMAN ] memb: command return code is 0 Aug 19 00:09:59 vm02-test corosync[7394]: [CMAN ] daemon: Returning command data. length = 0 Aug 19 00:09:59 vm02-test corosync[7394]: [CMAN ] daemon: sending reply c00000b7 to fd 17 Aug 19 00:10:00 vm02-test corosync[7394]: [CMAN ] daemon: read 20 bytes from fd 17 Aug 19 00:10:00 vm02-test corosync[7394]: [CMAN ] daemon: client command is 7 Aug 19 00:10:00 vm02-test corosync[7394]: [CMAN ] daemon: About to process command Aug 19 00:10:00 vm02-test corosync[7394]: [CMAN ] memb: command to process is 7 Aug 19 00:10:00 vm02-test corosync[7394]: [CMAN ] memb: get_all_members: allocated new buffer (retsize=1024) Aug 19 00:10:00 vm02-test corosync[7394]: [CMAN ] memb: get_all_members: retlen = 1760 Aug 19 00:10:00 vm02-test corosync[7394]: [CMAN ] memb: command return code is 4 Aug 19 00:10:00 vm02-test corosync[7394]: [CMAN ] daemon: Returning command data. length = 1760 Aug 19 00:10:00 vm02-test corosync[7394]: [CMAN ] daemon: sending reply 40000007 to fd 17 Aug 19 00:10:00 vm02-test corosync[7394]: [CMAN ] daemon: read 20 bytes from fd 17 Aug 19 00:10:00 vm02-test corosync[7394]: [CMAN ] daemon: client command is 800000b7 Aug 19 00:10:00 vm02-test corosync[7394]: [CMAN ] daemon: About to process command Aug 19 00:10:00 vm02-test corosync[7394]: [CMAN ] memb: command to process is 800000b7 Aug 19 00:10:00 vm02-test corosync[7394]: [CMAN ] memb: command return code is 0 Aug 19 00:10:00 vm02-test corosync[7394]: [CMAN ] daemon: Returning command data. length = 0 Aug 19 00:10:00 vm02-test corosync[7394]: [CMAN ] daemon: sending reply c00000b7 to fd 17 Aug 19 00:10:01 vm02-test corosync[7394]: [CMAN ] daemon: read 20 bytes from fd 17 Aug 19 00:10:01 vm02-test corosync[7394]: [CMAN ] daemon: client command is 7 Aug 19 00:10:01 vm02-test corosync[7394]: [CMAN ] daemon: About to process command Aug 19 00:10:01 vm02-test corosync[7394]: [CMAN ] memb: command to process is 7 Aug 19 00:10:01 vm02-test corosync[7394]: [CMAN ] memb: get_all_members: allocated new buffer (retsize=1024) Aug 19 00:10:01 vm02-test corosync[7394]: [CMAN ] memb: get_all_members: retlen = 1760 Aug 19 00:10:01 vm02-test corosync[7394]: [CMAN ] memb: command return code is 4 Aug 19 00:10:01 vm02-test corosync[7394]: [CMAN ] daemon: Returning command data. length = 1760 Aug 19 00:10:01 vm02-test corosync[7394]: [CMAN ] daemon: sending reply 40000007 to fd 17 Aug 19 00:10:01 vm02-test corosync[7394]: [CMAN ] daemon: read 20 bytes from fd 17 Aug 19 00:10:01 vm02-test corosync[7394]: [CMAN ] daemon: client command is 800000b7 Aug 19 00:10:01 vm02-test corosync[7394]: [CMAN ] daemon: About to process command Aug 19 00:10:01 vm02-test corosync[7394]: [CMAN ] memb: command to process is 800000b7 Aug 19 00:10:01 vm02-test corosync[7394]: [CMAN ] memb: command return code is 0 Aug 19 00:10:01 vm02-test corosync[7394]: [CMAN ] daemon: Returning command data. length = 0 Aug 19 00:10:01 vm02-test corosync[7394]: [CMAN ] daemon: sending reply c00000b7 to fd 17 Aug 19 00:10:01 vm02-test kernel: [282120.240067] INFO: task kworker/1:0:3117 blocked for more than 120 seconds. Aug 19 00:10:01 vm02-test kernel: [282120.240182] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Aug 19 00:10:01 vm02-test kernel: [282120.240296] kworker/1:0 D ffff88032fc93900 0 3117 2 0x00000000 Aug 19 00:10:01 vm02-test kernel: [282120.240302] ffff8802bb4dfb30 0000000000000046 ffff88031e4744d0 ffff8802bb4dffd8 Aug 19 00:10:01 vm02-test kernel: [282120.240307] ffff8802bb4dffd8 ffff8802bb4dffd8 ffff88031e5796f0 ffff88031e4744d0 Aug 19 00:10:01 vm02-test kernel: [282120.240311] 0000000000000286 ffff88032ffbd0f8 ffff8802bb4dfbc8 0000000000000002
Aug 19 00:10:01 vm02-test kernel: [282120.240316] Call Trace:
Aug 19 00:10:01 vm02-test kernel: [282120.240334] [<ffffffffa0570290>] ? gfs2_glock_demote_wait+0x20/0x20 [gfs2] Aug 19 00:10:01 vm02-test kernel: [282120.240340] [<ffffffff81666c89>] schedule+0x29/0x70 Aug 19 00:10:01 vm02-test kernel: [282120.240349] [<ffffffffa057029e>] gfs2_glock_holder_wait+0xe/0x20 [gfs2] Aug 19 00:10:01 vm02-test kernel: [282120.240352] [<ffffffff81665400>] __wait_on_bit+0x60/0x90 Aug 19 00:10:01 vm02-test kernel: [282120.240361] [<ffffffffa0570290>] ? gfs2_glock_demote_wait+0x20/0x20 [gfs2] Aug 19 00:10:01 vm02-test kernel: [282120.240364] [<ffffffff816654ac>] out_of_line_wait_on_bit+0x7c/0x90 Aug 19 00:10:01 vm02-test kernel: [282120.240369] [<ffffffff81073400>] ? autoremove_wake_function+0x40/0x40 Aug 19 00:10:01 vm02-test kernel: [282120.240378] [<ffffffffa05713a7>] wait_on_holder+0x47/0x80 [gfs2] Aug 19 00:10:01 vm02-test kernel: [282120.240388] [<ffffffffa05741d8>] gfs2_glock_nq+0x328/0x450 [gfs2] Aug 19 00:10:01 vm02-test kernel: [282120.240399] [<ffffffffa058a8ca>] gfs2_check_blk_type+0x4a/0x150 [gfs2] Aug 19 00:10:01 vm02-test kernel: [282120.240410] [<ffffffffa058a8c1>] ? gfs2_check_blk_type+0x41/0x150 [gfs2] Aug 19 00:10:01 vm02-test kernel: [282120.240421] [<ffffffffa058ba0c>] gfs2_evict_inode+0x2cc/0x360 [gfs2] Aug 19 00:10:01 vm02-test kernel: [282120.240432] [<ffffffffa058b842>] ? gfs2_evict_inode+0x102/0x360 [gfs2] Aug 19 00:10:01 vm02-test kernel: [282120.240437] [<ffffffff811940c2>] evict+0xb2/0x1b0 Aug 19 00:10:01 vm02-test kernel: [282120.240440] [<ffffffff811942c9>] iput+0x109/0x210 Aug 19 00:10:01 vm02-test kernel: [282120.240448] [<ffffffffa0572fdc>] delete_work_func+0x5c/0x90 [gfs2] Aug 19 00:10:01 vm02-test kernel: [282120.240453] [<ffffffff8106d5fa>] process_one_work+0x12a/0x420 Aug 19 00:10:01 vm02-test kernel: [282120.240462] [<ffffffffa0572f80>] ? gfs2_holder_uninit+0x40/0x40 [gfs2] Aug 19 00:10:01 vm02-test kernel: [282120.240465] [<ffffffff8106e19e>] worker_thread+0x12e/0x2f0 Aug 19 00:10:01 vm02-test kernel: [282120.240469] [<ffffffff8106e070>] ? manage_workers.isra.25+0x200/0x200 Aug 19 00:10:01 vm02-test kernel: [282120.240472] [<ffffffff81072e73>] kthread+0x93/0xa0 Aug 19 00:10:01 vm02-test kernel: [282120.240477] [<ffffffff816710a4>] kernel_thread_helper+0x4/0x10 Aug 19 00:10:01 vm02-test kernel: [282120.240480] [<ffffffff81072de0>] ? flush_kthread_worker+0x80/0x80 Aug 19 00:10:01 vm02-test kernel: [282120.240484] [<ffffffff816710a0>] ? gs_change+0x13/0x13 Aug 19 00:10:02 vm02-test corosync[7394]: [CMAN ] daemon: read 20 bytes from fd 17 Aug 19 00:10:02 vm02-test corosync[7394]: [CMAN ] daemon: client command is 7 Aug 19 00:10:02 vm02-test corosync[7394]: [CMAN ] daemon: About to process command Aug 19 00:10:02 vm02-test corosync[7394]: [CMAN ] memb: command to process is 7 Aug 19 00:10:02 vm02-test corosync[7394]: [CMAN ] memb: get_all_members: allocated new buffer (retsize=1024) Aug 19 00:10:02 vm02-test corosync[7394]: [CMAN ] memb: get_all_members: retlen = 1760 Aug 19 00:10:02 vm02-test corosync[7394]: [CMAN ] memb: command return code is 4 Aug 19 00:10:02 vm02-test corosync[7394]: [CMAN ] daemon: Returning command data. length = 1760 Aug 19 00:10:02 vm02-test corosync[7394]: [CMAN ] daemon: sending reply 40000007 to fd 17 Aug 19 00:10:02 vm02-test corosync[7394]: [CMAN ] daemon: read 20 bytes from fd 17 Aug 19 00:10:02 vm02-test corosync[7394]: [CMAN ] daemon: client command is 800000b7 Aug 19 00:10:02 vm02-test corosync[7394]: [CMAN ] daemon: About to process command Aug 19 00:10:02 vm02-test corosync[7394]: [CMAN ] memb: command to process is 800000b7 Aug 19 00:10:02 vm02-test corosync[7394]: [CMAN ] memb: command return code is 0 Aug 19 00:10:02 vm02-test corosync[7394]: [CMAN ] daemon: Returning command data. length = 0 Aug 19 00:10:02 vm02-test corosync[7394]: [CMAN ] daemon: sending reply c00000b7 to fd 17 Aug 19 00:10:03 vm02-test corosync[7394]: [CMAN ] daemon: read 20 bytes from fd 17 Aug 19 00:10:03 vm02-test corosync[7394]: [CMAN ] daemon: client command is 7 Aug 19 00:10:03 vm02-test corosync[7394]: [CMAN ] daemon: About to process command Aug 19 00:10:03 vm02-test corosync[7394]: [CMAN ] memb: command to process is 7 Aug 19 00:10:03 vm02-test corosync[7394]: [CMAN ] memb: get_all_members: allocated new buffer (retsize=1024) Aug 19 00:10:03 vm02-test corosync[7394]: [CMAN ] memb: get_all_members: retlen = 1760 Aug 19 00:10:03 vm02-test corosync[7394]: [CMAN ] memb: command return code is 4 Aug 19 00:10:03 vm02-test corosync[7394]: [CMAN ] daemon: Returning command data. length = 1760 Aug 19 00:10:03 vm02-test corosync[7394]: [CMAN ] daemon: sending reply 40000007 to fd 17 Aug 19 00:10:03 vm02-test corosync[7394]: [CMAN ] daemon: read 20 bytes from fd 17

Does this help in any way?

Kind regards,

Bart

Digimer schreef op 19.08.2012 22:24:
The config looks fine. Sorry, I should have specified syslog;
/var/log/messages.

On 08/19/2012 03:05 PM, Bart Verwilst wrote:
<cluster name="kvm" config_version="13">
    <logging debug="on"/>
        <clusternodes>
        <clusternode name="vm01-test" nodeid="1">
        <fence>
            <method name="apc">
                <device name="apc01" port="1" action="off"/>
                <device name="apc02" port="1" action="off"/>
                <device name="apc01" port="1" action="on"/>
                <device name="apc02" port="1" action="on"/>
            </method>
        </fence>
        </clusternode>
        <clusternode name="vm02-test" nodeid="2">
        <fence>
            <method name="apc">
                <device name="apc01" port="8" action="off"/>
                <device name="apc02" port="8" action="off"/>
                <device name="apc01" port="8" action="on"/>
                <device name="apc02" port="8" action="on"/>
            </method>
                </fence>
        </clusternode>
        <clusternode name="vm03-test" nodeid="3">
        <fence>
            <method name="apc">
                <device name="apc01" port="2" action="off"/>
                <device name="apc02" port="2" action="off"/>
                <device name="apc01" port="2" action="on"/>
                <device name="apc02" port="2" action="on"/>
            </method>
                </fence>
        </clusternode>
        </clusternodes>
    <fencedevices>
        <fencedevice agent="fence_apc" ipaddr="apc01" secure="on"
login="device" name="apc01" passwd="xxx"/>
        <fencedevice agent="fence_apc" ipaddr="apc02" secure="on"
login="device" name="apc02" passwd="xxx"/>
    </fencedevices>
    <rm log_level="5">
        <failoverdomains>
<failoverdomain name="any_node" nofailback="1" ordered="0"
restricted="0"/>
        </failoverdomains>
        <vm domain="any_node" max_restarts="2" migrate="live"
name="firewall" path="/etc/libvirt/qemu/" recovery="restart"
restart_expire_time="600"/>
        <vm domain="any_node" max_restarts="2" migrate="live"
name="zabbix" path="/etc/libvirt/qemu/" recovery="restart"
restart_expire_time="600"/>
    </rm>
    <totem rrp_mode="none" secauth="off"/>
    <quorumd device="/dev/mapper/iscsi_cluster_quorum"></quorumd>
</cluster>

dlm_control.log*, fenced.log, corosync.log are either empty or just
contain entries until Aug 15 or ( in case of fenced.log ) "Aug 18
23:41:46 fenced logging mode 3 syslog f 160 p 6 logfile p 7
/var/log/cluster/fenced.log". I was expecting at least some log output, since everything seems to work fine. Maybe enabling more debugging might
bring something up ( but debugging is already on @ cluster.conf )

Kind regards,

Bart

Digimer schreef op 19.08.2012 17:45:
On 08/19/2012 05:52 AM, Bart Verwilst wrote:
Hi,

I have a 3-node cluster in testing which seem to work quite well ( cman,
rgmanager, gfs2, etc ).
On (only) one of my nodes, yesterday i noticed the message below in
dmesg.

I saw this 30 minutes after the facts. I could browse both my gfs2
mounts, there was no fencing or anything on any node.

Any idea what might have caused this, and then go away?

Aug 19 00:10:01 vm02-test kernel: [282120.240067] INFO: task
kworker/1:0:3117 blocked for more than 120 seconds.
Aug 19 00:10:01 vm02-test kernel: [282120.240182] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 19 00:10:01 vm02-test kernel: [282120.240296] kworker/1:0 D
ffff88032fc93900     0  3117      2 0x00000000
Aug 19 00:10:01 vm02-test kernel: [282120.240302] ffff8802bb4dfb30
0000000000000046 ffff88031e4744d0 ffff8802bb4dffd8
Aug 19 00:10:01 vm02-test kernel: [282120.240307] ffff8802bb4dffd8
ffff8802bb4dffd8 ffff88031e5796f0 ffff88031e4744d0
Aug 19 00:10:01 vm02-test kernel: [282120.240311] 0000000000000286
ffff88032ffbd0f8 ffff8802bb4dfbc8 0000000000000002
Aug 19 00:10:01 vm02-test kernel: [282120.240316] Call Trace:
Aug 19 00:10:01 vm02-test kernel: [282120.240334] [<ffffffffa0570290>]
? gfs2_glock_demote_wait+0x20/0x20 [gfs2]
Aug 19 00:10:01 vm02-test kernel: [282120.240340] [<ffffffff81666c89>]
schedule+0x29/0x70
Aug 19 00:10:01 vm02-test kernel: [282120.240349] [<ffffffffa057029e>]
gfs2_glock_holder_wait+0xe/0x20 [gfs2]
Aug 19 00:10:01 vm02-test kernel: [282120.240352] [<ffffffff81665400>]
__wait_on_bit+0x60/0x90
Aug 19 00:10:01 vm02-test kernel: [282120.240361] [<ffffffffa0570290>]
? gfs2_glock_demote_wait+0x20/0x20 [gfs2]
Aug 19 00:10:01 vm02-test kernel: [282120.240364] [<ffffffff816654ac>]
out_of_line_wait_on_bit+0x7c/0x90
Aug 19 00:10:01 vm02-test kernel: [282120.240369] [<ffffffff81073400>]
? autoremove_wake_function+0x40/0x40
Aug 19 00:10:01 vm02-test kernel: [282120.240378] [<ffffffffa05713a7>]
wait_on_holder+0x47/0x80 [gfs2]
Aug 19 00:10:01 vm02-test kernel: [282120.240388] [<ffffffffa05741d8>]
gfs2_glock_nq+0x328/0x450 [gfs2]
Aug 19 00:10:01 vm02-test kernel: [282120.240399] [<ffffffffa058a8ca>]
gfs2_check_blk_type+0x4a/0x150 [gfs2]
Aug 19 00:10:01 vm02-test kernel: [282120.240410] [<ffffffffa058a8c1>]
? gfs2_check_blk_type+0x41/0x150 [gfs2]
Aug 19 00:10:01 vm02-test kernel: [282120.240421] [<ffffffffa058ba0c>]
gfs2_evict_inode+0x2cc/0x360 [gfs2]
Aug 19 00:10:01 vm02-test kernel: [282120.240432] [<ffffffffa058b842>]
? gfs2_evict_inode+0x102/0x360 [gfs2]
Aug 19 00:10:01 vm02-test kernel: [282120.240437] [<ffffffff811940c2>]
evict+0xb2/0x1b0
Aug 19 00:10:01 vm02-test kernel: [282120.240440] [<ffffffff811942c9>]
iput+0x109/0x210
Aug 19 00:10:01 vm02-test kernel: [282120.240448] [<ffffffffa0572fdc>]
delete_work_func+0x5c/0x90 [gfs2]
Aug 19 00:10:01 vm02-test kernel: [282120.240453] [<ffffffff8106d5fa>]
process_one_work+0x12a/0x420
Aug 19 00:10:01 vm02-test kernel: [282120.240462] [<ffffffffa0572f80>]
? gfs2_holder_uninit+0x40/0x40 [gfs2]
Aug 19 00:10:01 vm02-test kernel: [282120.240465] [<ffffffff8106e19e>]
worker_thread+0x12e/0x2f0
Aug 19 00:10:01 vm02-test kernel: [282120.240469] [<ffffffff8106e070>]
? manage_workers.isra.25+0x200/0x200
Aug 19 00:10:01 vm02-test kernel: [282120.240472] [<ffffffff81072e73>]
kthread+0x93/0xa0
Aug 19 00:10:01 vm02-test kernel: [282120.240477] [<ffffffff816710a4>]
kernel_thread_helper+0x4/0x10
Aug 19 00:10:01 vm02-test kernel: [282120.240480] [<ffffffff81072de0>]
? flush_kthread_worker+0x80/0x80
Aug 19 00:10:01 vm02-test kernel: [282120.240484] [<ffffffff816710a0>]
? gs_change+0x13/0x13
Aug 19 00:12:01 vm02-test kernel: [282240.240061] INFO: task
kworker/1:0:3117 blocked for more than 120 seconds.
Aug 19 00:12:01 vm02-test kernel: [282240.240175] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 19 00:12:01 vm02-test kernel: [282240.240289] kworker/1:0 D
ffff88032fc93900     0  3117      2 0x00000000
Aug 19 00:12:01 vm02-test kernel: [282240.240294] ffff8802bb4dfb30
0000000000000046 ffff88031e4744d0 ffff8802bb4dffd8
Aug 19 00:12:01 vm02-test kernel: [282240.240299] ffff8802bb4dffd8
ffff8802bb4dffd8 ffff88031e5796f0 ffff88031e4744d0
Aug 19 00:12:01 vm02-test kernel: [282240.240304] 0000000000000286
ffff88032ffbd0f8 ffff8802bb4dfbc8 0000000000000002
Aug 19 00:12:01 vm02-test kernel: [282240.240309] Call Trace:
Aug 19 00:12:01 vm02-test kernel: [282240.240326] [<ffffffffa0570290>]
? gfs2_glock_demote_wait+0x20/0x20 [gfs2]
Aug 19 00:12:01 vm02-test kernel: [282240.240332] [<ffffffff81666c89>]
schedule+0x29/0x70
Aug 19 00:12:01 vm02-test kernel: [282240.240341] [<ffffffffa057029e>]
gfs2_glock_holder_wait+0xe/0x20 [gfs2]
Aug 19 00:12:01 vm02-test kernel: [282240.240345] [<ffffffff81665400>]
__wait_on_bit+0x60/0x90
Aug 19 00:12:01 vm02-test kernel: [282240.240353] [<ffffffffa0570290>]
? gfs2_glock_demote_wait+0x20/0x20 [gfs2]
Aug 19 00:12:01 vm02-test kernel: [282240.240357] [<ffffffff816654ac>]
out_of_line_wait_on_bit+0x7c/0x90
Aug 19 00:12:01 vm02-test kernel: [282240.240362] [<ffffffff81073400>]
? autoremove_wake_function+0x40/0x40
Aug 19 00:12:01 vm02-test kernel: [282240.240371] [<ffffffffa05713a7>]
wait_on_holder+0x47/0x80 [gfs2]
Aug 19 00:12:01 vm02-test kernel: [282240.240380] [<ffffffffa05741d8>]
gfs2_glock_nq+0x328/0x450 [gfs2]
Aug 19 00:12:01 vm02-test kernel: [282240.240391] [<ffffffffa058a8ca>]
gfs2_check_blk_type+0x4a/0x150 [gfs2]
Aug 19 00:12:01 vm02-test kernel: [282240.240402] [<ffffffffa058a8c1>]
? gfs2_check_blk_type+0x41/0x150 [gfs2]
Aug 19 00:12:01 vm02-test kernel: [282240.240413] [<ffffffffa058ba0c>]
gfs2_evict_inode+0x2cc/0x360 [gfs2]
Aug 19 00:12:01 vm02-test kernel: [282240.240424] [<ffffffffa058b842>]
? gfs2_evict_inode+0x102/0x360 [gfs2]
Aug 19 00:12:01 vm02-test kernel: [282240.240429] [<ffffffff811940c2>]
evict+0xb2/0x1b0
Aug 19 00:12:01 vm02-test kernel: [282240.240432] [<ffffffff811942c9>]
iput+0x109/0x210
Aug 19 00:12:01 vm02-test kernel: [282240.240440] [<ffffffffa0572fdc>]
delete_work_func+0x5c/0x90 [gfs2]
Aug 19 00:12:01 vm02-test kernel: [282240.240445] [<ffffffff8106d5fa>]
process_one_work+0x12a/0x420
Aug 19 00:12:01 vm02-test kernel: [282240.240454] [<ffffffffa0572f80>]
? gfs2_holder_uninit+0x40/0x40 [gfs2]
Aug 19 00:12:01 vm02-test kernel: [282240.240458] [<ffffffff8106e19e>]
worker_thread+0x12e/0x2f0
Aug 19 00:12:01 vm02-test kernel: [282240.240462] [<ffffffff8106e070>]
? manage_workers.isra.25+0x200/0x200
Aug 19 00:12:01 vm02-test kernel: [282240.240465] [<ffffffff81072e73>]
kthread+0x93/0xa0
Aug 19 00:12:01 vm02-test kernel: [282240.240469] [<ffffffff816710a4>]
kernel_thread_helper+0x4/0x10
Aug 19 00:12:01 vm02-test kernel: [282240.240473] [<ffffffff81072de0>]
? flush_kthread_worker+0x80/0x80
Aug 19 00:12:01 vm02-test kernel: [282240.240476] [<ffffffff816710a0>]
? gs_change+0x13/0x13
<snip, goes on for a while>

Kind regards,

Bart

I usually see this when DLM is blocked. DLM usually blocks on a failed
fence action.

To clarify; this comes up on one of the three nodes only? On the node with these messages, you shouldn't be able to look at the hung FS on the
effected node.

Can you share you versions and cluster.conf please? Also, what is in the logs in the three or four minutes before these messages start? Anything interesting in the log files of the other nodes around the same time
period?

digimer



[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]