Bug#624791: reiserfs woes on huge write session
Package: linux-image-2.6.32-5-amd64
Version: 2.6.32-31
I have a large file server with about 3.5 TByte of disk space (reiserfs),
64 GByte of RAM and two quad-core CPUs. When I try to copy about 2 TByte
of data to this disk with rsync, the load climbs to 6 or higher within a
few minutes, and then the kernel logs this message:
May 1 14:50:04 srvl011 kernel: [ 1081.086183] reiserfs/1 D 0000000000000000 0 334 2 0x00000000
May 1 14:50:04 srvl011 kernel: [ 1081.086188] ffff881029cb3880 0000000000000046 0000000000000000 ffffffff812fae54
May 1 14:50:04 srvl011 kernel: [ 1081.086191] 0000000000000000 0000000000000046 000000000000f9e0 ffff8810262b5fd8
May 1 14:50:04 srvl011 kernel: [ 1081.086193] 0000000000015780 0000000000015780 ffff881026fa5bd0 ffff881026fa5ec8
May 1 14:50:04 srvl011 kernel: [ 1081.086195] Call Trace:
May 1 14:50:04 srvl011 kernel: [ 1081.086205] [<ffffffff812fae54>] ? thread_return+0x8d/0xe0
May 1 14:50:04 srvl011 kernel: [ 1081.086208] [<ffffffff812fb65b>] ? __mutex_lock_common+0x122/0x192
May 1 14:50:04 srvl011 kernel: [ 1081.086210] [<ffffffff812fb783>] ? mutex_lock+0x1a/0x31
May 1 14:50:04 srvl011 kernel: [ 1081.086232] [<ffffffffa01682da>] ? flush_commit_list+0x146/0x5d0 [reiserfs]
May 1 14:50:04 srvl011 kernel: [ 1081.086241] [<ffffffffa016828a>] ? flush_commit_list+0xf6/0x5d0 [reiserfs]
May 1 14:50:04 srvl011 kernel: [ 1081.086249] [<ffffffffa016883d>] ? flush_async_commits+0x42/0x4b [reiserfs]
May 1 14:50:04 srvl011 kernel: [ 1081.086258] [<ffffffff8106186b>] ? worker_thread+0x188/0x21d
May 1 14:50:04 srvl011 kernel: [ 1081.086266] [<ffffffffa01687fb>] ? flush_async_commits+0x0/0x4b [reiserfs]
May 1 14:50:04 srvl011 kernel: [ 1081.086272] [<ffffffff81064e96>] ? autoremove_wake_function+0x0/0x2e
May 1 14:50:04 srvl011 kernel: [ 1081.086278] [<ffffffff810616e3>] ? worker_thread+0x0/0x21d
May 1 14:50:04 srvl011 kernel: [ 1081.086282] [<ffffffff81064bc9>] ? kthread+0x79/0x81
May 1 14:50:04 srvl011 kernel: [ 1081.086289] [<ffffffff81011baa>] ? child_rip+0xa/0x20
May 1 14:50:04 srvl011 kernel: [ 1081.086294] [<ffffffff81064b50>] ? kthread+0x0/0x81
May 1 14:50:04 srvl011 kernel: [ 1081.086298] [<ffffffff81011ba0>] ? child_rip+0x0/0x20
May 1 14:50:04 srvl011 kernel: [ 1081.086310] reiserfs/3 D 0000000000000000 0 336 2 0x00000000
May 1 14:50:04 srvl011 kernel: [ 1081.086317] ffff881029cb7100 0000000000000046 ffffffff813c7097 ffff881026237da5
May 1 14:50:04 srvl011 kernel: [ 1081.086325] ffff881026237d98 0000000000000006 000000000000f9e0 ffff881026237fd8
May 1 14:50:04 srvl011 kernel: [ 1081.086332] 0000000000015780 0000000000015780 ffff881026fa62e0 ffff881026fa65d8
May 1 14:50:04 srvl011 kernel: [ 1081.086340] Call Trace:
May 1 14:50:04 srvl011 kernel: [ 1081.086345] [<ffffffff812fb65b>] ? __mutex_lock_common+0x122/0x192
May 1 14:50:04 srvl011 kernel: [ 1081.086350] [<ffffffff812fb783>] ? mutex_lock+0x1a/0x31
May 1 14:50:04 srvl011 kernel: [ 1081.086359] [<ffffffffa01682da>] ? flush_commit_list+0x146/0x5d0 [reiserfs]
May 1 14:50:04 srvl011 kernel: [ 1081.086368] [<ffffffffa016828a>] ? flush_commit_list+0xf6/0x5d0 [reiserfs]
May 1 14:50:04 srvl011 kernel: [ 1081.086376] [<ffffffffa016883d>] ? flush_async_commits+0x42/0x4b [reiserfs]
May 1 14:50:04 srvl011 kernel: [ 1081.086382] [<ffffffff8106186b>] ? worker_thread+0x188/0x21d
May 1 14:50:04 srvl011 kernel: [ 1081.086390] [<ffffffffa01687fb>] ? flush_async_commits+0x0/0x4b [reiserfs]
May 1 14:50:04 srvl011 kernel: [ 1081.086396] [<ffffffff81064e96>] ? autoremove_wake_function+0x0/0x2e
May 1 14:50:04 srvl011 kernel: [ 1081.086401] [<ffffffff810616e3>] ? worker_thread+0x0/0x21d
May 1 14:50:04 srvl011 kernel: [ 1081.086407] [<ffffffff81064bc9>] ? kthread+0x79/0x81
May 1 14:50:04 srvl011 kernel: [ 1081.086412] [<ffffffff81011baa>] ? child_rip+0xa/0x20
May 1 14:50:04 srvl011 kernel: [ 1081.086417] [<ffffffff810616e3>] ? worker_thread+0x0/0x21d
May 1 14:50:04 srvl011 kernel: [ 1081.086422] [<ffffffff81064b50>] ? kthread+0x0/0x81
May 1 14:50:04 srvl011 kernel: [ 1081.086427] [<ffffffff81011ba0>] ? child_rip+0x0/0x20
May 1 14:50:04 srvl011 kernel: [ 1081.086446] rsync D 0000000000000000 0 1927 1926 0x00000000
May 1 14:50:04 srvl011 kernel: [ 1081.086452] ffff881029cb7100 0000000000000082 0000000000000000 0000000000000000
May 1 14:50:04 srvl011 kernel: [ 1081.086461] ffff880000056a08 0000000000000001 000000000000f9e0 ffff880f5806bfd8
May 1 14:50:04 srvl011 kernel: [ 1081.086468] 0000000000015780 0000000000015780 ffff88102640b170 ffff88102640b468
May 1 14:50:04 srvl011 kernel: [ 1081.086476] Call Trace:
May 1 14:50:04 srvl011 kernel: [ 1081.086481] [<ffffffff812fb65b>] ? __mutex_lock_common+0x122/0x192
May 1 14:50:04 srvl011 kernel: [ 1081.086486] [<ffffffff812fb783>] ? mutex_lock+0x1a/0x31
May 1 14:50:04 srvl011 kernel: [ 1081.086494] [<ffffffff8110d62d>] ? __find_get_block+0x176/0x186
May 1 14:50:04 srvl011 kernel: [ 1081.086502] [<ffffffffa01682da>] ? flush_commit_list+0x146/0x5d0 [reiserfs]
May 1 14:50:04 srvl011 kernel: [ 1081.086510] [<ffffffff8105a834>] ? lock_timer_base+0x26/0x4b
May 1 14:50:04 srvl011 kernel: [ 1081.086515] [<ffffffff8105add6>] ? __mod_timer+0x141/0x153
May 1 14:50:04 srvl011 kernel: [ 1081.086523] [<ffffffffa01687cc>] ? get_list_bitmap+0x68/0x97 [reiserfs]
May 1 14:50:04 srvl011 kernel: [ 1081.086531] [<ffffffffa016a66f>] ? do_journal_end+0xa42/0xcf5 [reiserfs]
May 1 14:50:04 srvl011 kernel: [ 1081.086540] [<ffffffffa0155efe>] ? reiserfs_update_sd_size+0x2ba/0x2cc [reiserfs]
May 1 14:50:04 srvl011 kernel: [ 1081.086549] [<ffffffffa016ab11>] ? reiserfs_end_persistent_transaction+0x21/0x4a [reiserfs]
May 1 14:50:04 srvl011 kernel: [ 1081.086558] [<ffffffffa01560c5>] ? reiserfs_write_end+0x1b5/0x22b [reiserfs]
May 1 14:50:04 srvl011 kernel: [ 1081.086567] [<ffffffff810b4e09>] ? generic_file_buffered_write+0x18d/0x278
May 1 14:50:04 srvl011 kernel: [ 1081.086573] [<ffffffff810b52a5>] ? __generic_file_aio_write+0x25f/0x293
May 1 14:50:04 srvl011 kernel: [ 1081.086580] [<ffffffff810f5cc9>] ? pipe_read+0x39c/0x3af
May 1 14:50:04 srvl011 kernel: [ 1081.086585] [<ffffffff810b5332>] ? generic_file_aio_write+0x59/0x9f
May 1 14:50:04 srvl011 kernel: [ 1081.086592] [<ffffffff810ee966>] ? do_sync_write+0xce/0x113
May 1 14:50:04 srvl011 kernel: [ 1081.086597] [<ffffffff81064e96>] ? autoremove_wake_function+0x0/0x2e
May 1 14:50:04 srvl011 kernel: [ 1081.086604] [<ffffffff8106c403>] ? ktime_get_ts+0x68/0xb2
May 1 14:50:04 srvl011 kernel: [ 1081.086610] [<ffffffff810ef2b8>] ? vfs_write+0xa9/0x102
May 1 14:50:04 srvl011 kernel: [ 1081.086616] [<ffffffff810ef3cd>] ? sys_write+0x45/0x6e
May 1 14:50:04 srvl011 kernel: [ 1081.086621] [<ffffffff810114ce>] ? common_interrupt+0xe/0x13
May 1 14:50:04 srvl011 kernel: [ 1081.086627] [<ffffffff81010b42>] ? system_call_fastpath+0x16/0x1b
After that the rsync session is stuck and cannot be killed; "sync" hangs
as well. The only thing I can do is press reset.
With xfs instead of reiserfs the problem does not occur.
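In case it helps with reading the trace, here is a rough sketch of how the
blocked tasks in a log like the one above could be summarised. It is not
part of any kernel tooling; the names blocked_tasks, TASK_RE, FRAME_RE and
SAMPLE are made up for this illustration, and the regular expressions
assume the 2.6.32 syslog layout shown in this report (task header with
state "D", frames written as "[<addr>] ? func+0xOFF/0xLEN [module]").
SAMPLE is only an excerpt of the log lines quoted above.

```python
import re

# Header of a blocked-task dump, for example:
#   "... kernel: [ 1081.086183] reiserfs/1 D 0000000000000000 0 334 2 0x00000000"
# Assumption: state is always "D" and the layout matches the log in this report.
TASK_RE = re.compile(
    r"\]\s+(?P<comm>\S+)\s+D\s+[0-9a-f]{16}\s+\d+\s+"
    r"(?P<pid>\d+)\s+(?P<ppid>\d+)\s+0x[0-9a-f]+\s*$"
)

# One stack frame, for example:
#   "[<ffffffffa01682da>] ? flush_commit_list+0x146/0x5d0 [reiserfs]"
# The trailing "[module]" is optional (absent for built-in kernel code).
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?:\?\s+)?(?P<func>[\w.]+)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(?P<mod>\w+)\])?"
)

# Excerpt of the log quoted in this report (not the full trace).
SAMPLE = """\
May 1 14:50:04 srvl011 kernel: [ 1081.086183] reiserfs/1 D 0000000000000000 0 334 2 0x00000000
May 1 14:50:04 srvl011 kernel: [ 1081.086195] Call Trace:
May 1 14:50:04 srvl011 kernel: [ 1081.086205] [<ffffffff812fae54>] ? thread_return+0x8d/0xe0
May 1 14:50:04 srvl011 kernel: [ 1081.086208] [<ffffffff812fb65b>] ? __mutex_lock_common+0x122/0x192
May 1 14:50:04 srvl011 kernel: [ 1081.086210] [<ffffffff812fb783>] ? mutex_lock+0x1a/0x31
May 1 14:50:04 srvl011 kernel: [ 1081.086232] [<ffffffffa01682da>] ? flush_commit_list+0x146/0x5d0 [reiserfs]
May 1 14:50:04 srvl011 kernel: [ 1081.086446] rsync D 0000000000000000 0 1927 1926 0x00000000
May 1 14:50:04 srvl011 kernel: [ 1081.086476] Call Trace:
May 1 14:50:04 srvl011 kernel: [ 1081.086481] [<ffffffff812fb65b>] ? __mutex_lock_common+0x122/0x192
May 1 14:50:04 srvl011 kernel: [ 1081.086486] [<ffffffff812fb783>] ? mutex_lock+0x1a/0x31
May 1 14:50:04 srvl011 kernel: [ 1081.086494] [<ffffffff8110d62d>] ? __find_get_block+0x176/0x186
May 1 14:50:04 srvl011 kernel: [ 1081.086502] [<ffffffffa01682da>] ? flush_commit_list+0x146/0x5d0 [reiserfs]
"""


def blocked_tasks(lines):
    """Group stack frames under the D-state task header that precedes them."""
    tasks = []
    for line in lines:
        header = TASK_RE.search(line)
        if header:
            tasks.append({
                "comm": header["comm"],
                "pid": int(header["pid"]),
                "ppid": int(header["ppid"]),
                "frames": [],  # list of (function, module or None)
            })
            continue
        frame = FRAME_RE.search(line)
        if frame and tasks:
            tasks[-1]["frames"].append((frame["func"], frame["mod"]))
    return tasks


def first_frame_in_module(task, module):
    """Return the first frame of the task that belongs to the given module, or None."""
    for func, mod in task["frames"]:
        if mod == module:
            return func
    return None
```

Calling blocked_tasks(SAMPLE.splitlines()) gives one entry per blocked
task, and first_frame_in_module(task, "reiserfs") picks out the first
reiserfs function in each trace. A full log file could be passed the same
way, as a list of its lines.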
Regards
Harri