[rhelv6-list] Uninterruptible sleep when accessing NFS

Eiríkur Hjartarson Eirikur.Hjartarson at decode.is
Thu Nov 24 12:44:47 UTC 2011


Hi all,

We are seeing many processes go into uninterruptible sleep (state D) when accessing NFS-mounted files.
Sometimes the processes recover, but sometimes they do not, and we have to reboot the machine.
This is probably related to a fairly high load on the NFS server. Below is an example of what we see in the kernel logs.

We are running RHEL 6.1, kernel version 2.6.32-131.17.1.el6.x86_64.

My question: is anyone else running RHEL 6.1 seeing this problem, and is there a known solution?
(There are several reports on the net about similar problems with kernels released within the last year, but I have found no solutions.)
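
For anyone who wants to check their own machines, here is a minimal sketch of how one might list the processes currently in uninterruptible sleep. It only reads the standard /proc/<pid>/stat files; the function names (parse_stat, d_state_tasks) are made up for this example, and it looks at the main task of each process only, not at the individual threads under /proc/<pid>/task.

```python
import os


def parse_stat(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the first '(' and the last ')'.
    lparen = stat_line.index("(")
    rparen = stat_line.rindex(")")
    pid = int(stat_line[:lparen].strip())
    comm = stat_line[lparen + 1:rparen]
    state = stat_line[rparen + 1:].split()[0]
    return pid, comm, state


def d_state_tasks(proc_root="/proc"):
    """Return a list of (pid, comm) for processes in state D."""
    tasks = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited while scanning, or unreadable entry
        if state == "D":
            tasks.append((pid, comm))
    return tasks


if __name__ == "__main__":
    for pid, comm in d_state_tasks():
        print(pid, comm)
```

Running this periodically (for example from cron) should show whether the same PIDs stay in state D for a long time or whether they come and go with the load on the server.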

---------------------------------------------------------------------------------------------------------------------------------------------
Nov 24 02:05:04 lclcx487 kernel: INFO: task java:12278 blocked for more than 120 seconds.
Nov 24 02:05:04 lclcx487 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 24 02:05:04 lclcx487 kernel: java          D 0000000000000008     0 12278  12269 0x00000080
Nov 24 02:05:04 lclcx487 kernel: ffff88078bf05c78 0000000000000082 ffff88078bf05bf8 ffffffffa0257ca9
Nov 24 02:05:04 lclcx487 kernel: ffff8805d7c75c00 ffff880969800340 ffff8806872feb60 ffff8805d7c75c08
Nov 24 02:05:04 lclcx487 kernel: ffff88096931ba78 ffff88078bf05fd8 000000000000f598 ffff88096931ba78
Nov 24 02:05:04 lclcx487 kernel: Call Trace:
Nov 24 02:05:04 lclcx487 kernel: [<ffffffffa0257ca9>] ? rpc_run_task+0xd9/0x130 [sunrpc]
Nov 24 02:05:04 lclcx487 kernel: [<ffffffff81098d19>] ? ktime_get_ts+0xa9/0xe0
Nov 24 02:05:04 lclcx487 kernel: [<ffffffff8110d3d0>] ? sync_page+0x0/0x50
Nov 24 02:05:04 lclcx487 kernel: [<ffffffff814db743>] io_schedule+0x73/0xc0
Nov 24 02:05:04 lclcx487 kernel: [<ffffffff8110d40d>] sync_page+0x3d/0x50
Nov 24 02:05:04 lclcx487 kernel: [<ffffffff814dbfaf>] __wait_on_bit+0x5f/0x90
Nov 24 02:05:04 lclcx487 kernel: [<ffffffff8110d5c3>] wait_on_page_bit+0x73/0x80
Nov 24 02:05:04 lclcx487 kernel: [<ffffffff8108e1c0>] ? wake_bit_function+0x0/0x50
Nov 24 02:05:04 lclcx487 kernel: [<ffffffff811232d5>] ? pagevec_lookup_tag+0x25/0x40
Nov 24 02:05:04 lclcx487 kernel: [<ffffffff8110d9db>] wait_on_page_writeback_range+0xfb/0x190
Nov 24 02:05:04 lclcx487 kernel: [<ffffffff8110dba8>] filemap_write_and_wait_range+0x78/0x90
Nov 24 02:05:04 lclcx487 kernel: [<ffffffff811a0abe>] vfs_fsync_range+0x7e/0xe0
Nov 24 02:05:04 lclcx487 kernel: [<ffffffff811a0b8d>] vfs_fsync+0x1d/0x20
Nov 24 02:05:04 lclcx487 kernel: [<ffffffffa0309410>] nfs_file_flush+0x70/0xa0 [nfs]
Nov 24 02:05:04 lclcx487 kernel: [<ffffffff8116f7cc>] filp_close+0x3c/0x90
Nov 24 02:05:04 lclcx487 kernel: [<ffffffff8116f8c5>] sys_close+0xa5/0x100
Nov 24 02:05:04 lclcx487 kernel: [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b
---------------------------------------------------------------------------------------------------------------------------------------------
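
To compare many such reports, a small parser can help. The following is only a sketch, assuming the log lines look like the excerpt above; the regular expressions and the function name parse_hung_tasks are my own and have not been checked against other kernel versions. Frames marked with "?" are addresses the kernel found on the stack but could not confirm as part of the call chain, so they are flagged as unreliable.

```python
import re

# Matches e.g. "INFO: task java:12278 blocked for more than 120 seconds."
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)
# Matches e.g. "[<ffffffffa0309410>] nfs_file_flush+0x70/0xa0 [nfs]"
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\] (?P<unreliable>\? )?(?P<func>[\w.]+)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(?P<module>\w+)\])?"
)


def parse_hung_tasks(lines):
    """Group call-trace frames under the hung-task report they belong to."""
    reports = []
    current = None
    for line in lines:
        m = HUNG_RE.search(line)
        if m:
            current = {
                "comm": m.group("comm"),
                "pid": int(m.group("pid")),
                "secs": int(m.group("secs")),
                "frames": [],  # (function, module or None, reliable)
            }
            reports.append(current)
            continue
        f = FRAME_RE.search(line)
        if f and current is not None:
            reliable = f.group("unreliable") is None
            current["frames"].append((f.group("func"), f.group("module"), reliable))
    return reports


if __name__ == "__main__":
    import sys

    for report in parse_hung_tasks(sys.stdin):
        chain = [func for func, _module, reliable in report["frames"] if reliable]
        print(report["comm"], report["pid"], " <- ".join(chain))
```

For the trace above, the reliable frames run from sys_close through nfs_file_flush and vfs_fsync down to io_schedule, i.e. the task is blocked in close() while waiting for dirty pages to be written back to the NFS server.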

Regards,
-- 
Eiríkur Hjartarson      E-mail: Eirikur.Hjartarson at decode.is
deCODE genetics         Mobile: +3546641898
Sturlugötu 7
IS-101 Reykjavík





