Hi!I've just tried to experiment with dm snapshot target to test a tool I've recently written for ext4 filesystem (it resizes inode tables without recreating the fs). Underlying device is /dev/sdc1 on a SATA hard drive, COW device is a loop device connected to a sparse file located on an ext4 /home filesystem.
At first there was no problem, the read/write performance was good. But then my tool started to move inode tables (~ 4.5 GB in total on a 3 TB partition) and hanged in 'D' state - I observe it at the moment, it already lasts for almost 2 hours :) 'sync' executed in another console also hangs. There are messages in dmesg:
[654255.339515] INFO: task sync:28318 blocked for more than 120 seconds. [654255.339522] Tainted: G W 3.12-trunk-amd64 #1[654255.339525] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [654255.339528] sync D ffff8800a7de8b00 0 28318 27636 0x20020000 [654255.339534] ffff8800a7de87c0 0000000000000082 00000000000142c0 ffff88008c3e3fd8 [654255.339539] 00000000000142c0 ffff88008c3e3fd8 ffff88011fc14b00 ffff88011ffce908 [654255.339543] 0000000000000002 ffffffff8110f530 ffff88008c3e3e20 000000002baa1371
[654255.339548] Call Trace: [654255.339558] [<ffffffff8110f530>] ? wait_on_page_read+0x60/0x60 [654255.339564] [<ffffffff81489194>] ? io_schedule+0x94/0x120 [654255.339568] [<ffffffff8110f535>] ? sleep_on_page+0x5/0x10 [654255.339574] [<ffffffff814871d4>] ? __wait_on_bit+0x54/0x80 [654255.339578] [<ffffffff8110f34f>] ? wait_on_page_bit+0x7f/0x90 [654255.339583] [<ffffffff8107a440>] ? wake_atomic_t_function+0x30/0x30 [654255.339589] [<ffffffff8111bc28>] ? pagevec_lookup_tag+0x18/0x20 [654255.339593] [<ffffffff8110f438>] ? filemap_fdatawait_range+0xd8/0x150 [654255.339598] [<ffffffff8119bf30>] ? sync_inodes_one_sb+0x20/0x20 [654255.339603] [<ffffffff811a62ce>] ? iterate_bdevs+0xbe/0xf0 [654255.339607] [<ffffffff8119c199>] ? sys_sync+0x69/0x90Interesting there are no such messages about my tool, although it's still in D state and I can't interrupt it using any signal.
At the same time the system overall is still responsive, and it's clearly seen that some work is being done - the place occupied by my sparse file which is the snapshot device grows very slowly - something about 1-2 megabytes in a minute. There's massive I/O wait (~60% cpu), NO user (0%) and NO system (0%) CPU usage, and almost NO real disk input/output - 'iostat 1' shows mostly zeroes, and only sometimes it shows some small values 'written' on a device with that sparse file.
'slabtop' shows very big, and roughly equal amounts of bio-0, bio-1, dm_io and buffer_head objects. kmalloc-32, dm_snap_pending_exception and kcopyd_job are less, but also close to 100%. Most of these numbers change slightly over time, except for kcopyd_job that is stuck at 77490. It looks like the following:
1311206 634877 48% 0,10K 35438 37 141752K buffer_head 621400 621304 99% 0,19K 31070 20 124280K bio-0 621184 621080 99% 0,04K 6752 92 27008K dm_io 621090 621080 99% 0,25K 41406 15 165624K bio-1 88816 86981 97% 0,03K 793 112 3172K kmalloc-3277959 77889 99% 0,10K 2107 37 8428K dm_snap_pending_exception
77490 77490 100% 3,23K 38745 2 309960K kcopyd_job (full /proc/slabinfo is attached)'dmsetup status' prints the following and the output does not change over time:
sdc1ovl: 0 5860531087 snapshot 806488/5860531087 728I've googled for lvm snapshot performance problems and there are many threads about it, but as I understand something similar to my case was only in a 2005 thread here:
http://www.redhat.com/archives/dm-devel/2005-June/msg00126.html And also maybe in another thread here: http://www.redhat.com/archives/dm-devel/2006-September/msg00078.htmlWhat do you think this problem could be? Thanks in advance to anyone who answers :-)
-- With best regards, Vitaliy Filippov
Description: Binary data