[Nbd] nbd hanging doing an svnadmin hotcopy
Hi,
I'm testing out using nbd to host svn repositories because nfs doesn't
work. However, when trying to do an svnadmin hotcopy, the whole thing
hung. I can't even kill -9 the processes involved, so I'm assuming
they're stuck in the kernel.
I thought it might be a TCP timeout issue, so I've waited long past the
15 minute TCP timeout (read that somewhere), but not one bit has changed
in the call traces of the processes during that time.
It's a Debian Etch system, running the latest (2.6.18-5) kernel with
nbd-client 2.8.7 userland. I've pulled the call traces of the offending
processes in case that helps anyone. If I'm asking for help in the wrong
place, let me know where I should ask instead. However, if anyone has any
experience with this, help would be greatly appreciated.
Thanks very much in advance for your help.
Cheers,
Eric
Output of ps ax | grep D:
-------------------------
PID TTY STAT TIME COMMAND
11255 ? D< 0:00 [kjournald]
11790 pts/1 D+ 0:01 svnadmin hotcopy /srv/svn/...
11814 pts/2 D+ 0:00 du -s /srv/svn clubs
11920 pts/4 D+ 0:00 ls
12080 pts/3 S+ 0:00 grep D
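
Note that `grep D` matches any line containing a capital D, which is why the header row and the grep itself (state S+) appear in the listing above. A minimal sketch of a stricter filter follows, assuming the STAT column is the third whitespace-separated field as in that listing. The function name `uninterruptible` and the embedded sample text are illustrative only and are not part of any tool.

```python
def uninterruptible(ps_output):
    """Return the rows of `ps ax`-style output whose STAT field starts with 'D'."""
    rows = []
    for line in ps_output.splitlines():
        # Assumed column order: PID, TTY, STAT, TIME, COMMAND
        fields = line.split(None, 4)
        if len(fields) >= 3 and fields[2].startswith("D"):
            rows.append(line)
    return rows


# Sample rows copied from the listing above (not live output).
SAMPLE = """\
PID TTY STAT TIME COMMAND
11255 ? D< 0:00 [kjournald]
11790 pts/1 D+ 0:01 svnadmin hotcopy /srv/svn/...
12080 pts/3 S+ 0:00 grep D
"""

for row in uninterruptible(SAMPLE):
    print(row)
```

Filtering on the STAT field rather than the whole line drops the header and the grep process, leaving only the tasks in uninterruptible sleep.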
Call traces of those processes plus the nbd-client:
---------------------------------------------------
nbd-client S 00000001 0 11253 1 2774 (NOTLB)
c1dafacc 00000082 5380f91a 00000001 00000001 0000000a c4b07550 c766daa0
01b614f2 00018416 00006c6e 00000000 c4b07660 c11036e0 00000286 00000286
c4cb0980 00000246 00000001 c53d5b80 c4cb0980 c0253973 c45f1c34 7fffffff
Call Trace:
[<c0253973>] tcp_rcv_established+0x75e/0x7b7
[<c027f78a>] schedule_timeout+0x13/0x8c
[<c028085f>] _spin_lock_bh+0x8/0x18
[<c022216f>] release_sock+0xc/0x91
[<c02228c2>] sk_wait_data+0x67/0x98
[<c012d92d>] autoremove_wake_function+0x0/0x2d
[<c024d06a>] tcp_recvmsg+0x397/0x9e9
[<c0221c46>] sock_common_recvmsg+0x2f/0x45
[<c021ff1e>] sock_recvmsg+0xe5/0x100
[<c012d92d>] autoremove_wake_function+0x0/0x2d
[<c0116412>] __activate_task+0x1c/0x29
[<c011776e>] try_to_wake_up+0x355/0x35f
[<c012d93a>] autoremove_wake_function+0xd/0x2d
[<c02219db>] kernel_recvmsg+0x2b/0x3a
[<c8a1b123>] sock_xmit+0x123/0x22c [nbd]
[<c015cf25>] end_bio_bh_io_sync+0x0/0x39
[<c012d912>] __wake_up_bit+0x29/0x2e
[<c0143f7c>] mempool_free+0x5f/0x63
[<c015cf25>] end_bio_bh_io_sync+0x0/0x39
[<c015e552>] bio_put+0x28/0x29
[<c015cf5a>] end_bio_bh_io_sync+0x35/0x39
[<c015e6fd>] bio_endio+0x50/0x55
[<c0143f7c>] mempool_free+0x5f/0x63
[<c8a1b99a>] nbd_ioctl+0x32e/0x684 [nbd]
[<c0146b38>] __do_page_cache_readahead+0x69/0x1e8
[<c88f5272>] __journal_file_buffer+0x10e/0x1e3 [jbd]
[<c0146cfd>] blockable_page_cache_readahead+0x46/0x99
[<c0146dc0>] make_ahead_window+0x70/0x8d
[<c01214ab>] current_fs_time+0x4a/0x53
[<c016fc2a>] touch_atime+0x60/0x92
[<c01afd93>] blkdev_driver_ioctl+0x4b/0x5b
[<c01b03a9>] blkdev_ioctl+0x606/0x655
[<c0145841>] __alloc_pages+0x4e/0x275
[<c014170b>] find_get_page+0x18/0x38
[<c0143ccd>] filemap_nopage+0x19c/0x313
[<c014c2ee>] __handle_mm_fault+0x408/0x740
[<c01154b6>] do_page_fault+0x18a/0x481
[<c016041d>] block_ioctl+0x13/0x16
[<c016040a>] block_ioctl+0x0/0x16
[<c0169268>] do_ioctl+0x1c/0x5d
[<c01694f3>] vfs_ioctl+0x24a/0x25c
[<c016954d>] sys_ioctl+0x48/0x5f
[<c0102c11>] sysenter_past_esp+0x56/0x79
svnadmin D 00000000 0 11790 11777 (NOTLB)
c7a45dcc 00200082 5fed7199 00000000 c8938030 00000001 c65e8550 c02c66a0
01a2358a 00018416 00012c23 00000000 c65e8660 c11036e0 c014743b 00001000
00000000 c7a45da8 c7a45da8 c7a45e58 000000ff 00000000 00000000 c11036e0
Call Trace:
[<c014743b>] __pagevec_lru_add+0x90/0x9b
[<c027f65c>] io_schedule+0x26/0x30
[<c0141782>] sync_page+0x0/0x40
[<c01417bf>] sync_page+0x3d/0x40
[<c027f857>] __wait_on_bit_lock+0x2a/0x52
[<c014177c>] __lock_page+0x51/0x57
[<c012d95a>] wake_bit_function+0x0/0x3c
[<c0141e32>] do_generic_mapping_read+0x1c9/0x42a
[<c01428d2>] __generic_file_aio_read+0x16b/0x1b2
[<c01415bb>] file_read_actor+0x0/0xca
[<c0142954>] generic_file_aio_read+0x3b/0x42
[<c0159e13>] do_sync_read+0xb6/0xf1
[<c012d92d>] autoremove_wake_function+0x0/0x2d
[<c0159d5d>] do_sync_read+0x0/0xf1
[<c015a71c>] vfs_read+0x9f/0x141
[<c015ab68>] sys_read+0x3c/0x63
[<c0102c11>] sysenter_past_esp+0x56/0x79
du D 00000000 0 11814 11795 (NOTLB)
c42c3e5c 00000086 037d7455 00000000 c1ea13ac 00000008 c3c55550 c02c66a0
61dec800 00018432 00249f24 00000000 c3c55660 c11036e0 00000000 00000004
c015bacf c77b196c c3b71c34 c10ee840 000000ff 00000000 00000000 c11036e0
Call Trace:
[<c015bacf>] __find_get_block_slow+0xfb/0x105
[<c027f65c>] io_schedule+0x26/0x30
[<c015c546>] sync_buffer+0x0/0x33
[<c015c576>] sync_buffer+0x30/0x33
[<c027f857>] __wait_on_bit_lock+0x2a/0x52
[<c015c546>] sync_buffer+0x0/0x33
[<c027f8e1>] out_of_line_wait_on_bit_lock+0x62/0x6a
[<c012d95a>] wake_bit_function+0x0/0x3c
[<c015c698>] __lock_buffer+0x21/0x24
[<c88f56d2>] do_get_write_access+0x4c/0x462 [jbd]
[<c892a4cf>] __ext3_get_inode_loc+0x109/0x2b9 [ext3]
[<c88f5b00>] journal_get_write_access+0x18/0x26 [jbd]
[<c892a6bf>] ext3_reserve_inode_write+0x2f/0x78 [ext3]
[<c892a719>] ext3_mark_inode_dirty+0x11/0x27 [ext3]
[<c892d06a>] ext3_dirty_inode+0x53/0x66 [ext3]
[<c0176f11>] __mark_inode_dirty+0x27/0x15a
[<c016fc2a>] touch_atime+0x60/0x92
[<c0169733>] vfs_readdir+0x76/0x8d
[<c0169564>] filldir64+0x0/0xc3
[<c01697ad>] sys_getdents64+0x63/0xa5
[<c0102c11>] sysenter_past_esp+0x56/0x79
ls D 00000000 0 11920 11902 (NOTLB)
c71d3f54 00000086 018f610b 00000000 c02ccbe8 00000008 c4a8d000 c02c66a0
9d38226e 00018485 00601f9e 00000000 c4a8d110 c11036e0 00001000 00000000
01b11067 c1036220 c10582ac c2c15180 000000ff 00000000 00000000 c72ffb28
Call Trace:
[<c027faff>] __mutex_lock_slowpath+0x4a/0x79
[<c027fb33>] .text.lock.mutex+0x5/0x14
[<c0169708>] vfs_readdir+0x4b/0x8d
[<c0169564>] filldir64+0x0/0xc3
[<c01697ad>] sys_getdents64+0x63/0xa5
[<c0102c11>] sysenter_past_esp+0x56/0x79
--
Eric Gerlach, Network Administrator
Federation of Students
University of Waterloo
p: (519) 888-4567 x36329
e: egerlach@...135...