
[Nbd] nbd hanging doing an svnadmin hotcopy



Hi,

I'm testing nbd as a way to host svn repositories because nfs doesn't work. However, when trying to do an svnadmin hotcopy, the whole thing hung. I can't even kill -9 the processes involved, so I'm assuming they are stuck in the kernel.

I thought it might be a TCP timeout issue, so I've waited long past the 15 minute TCP timeout (read that somewhere), but not one bit has changed in the call traces of the processes during that time.

It's a Debian Etch system, running the latest (2.6.18-5) kernel with nbd-client 2.8.7 userland. I've pulled the call traces of the offending processes in case that helps anyone. If I'm asking for help in the wrong place, let me know where I should ask instead. However, if anyone has any experience with this, your help would be greatly appreciated.

Thanks very much in advance for your help.

Cheers,

Eric

Output of ps ax | grep D:
-------------------------

  PID TTY      STAT   TIME COMMAND
11255 ?        D<     0:00 [kjournald]
11790 pts/1    D+     0:01 svnadmin hotcopy /srv/svn/...
11814 pts/2    D+     0:00 du -s /srv/svn clubs
11920 pts/4    D+     0:00 ls
12080 pts/3    S+     0:00 grep D
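
Note that grep D also matches its own command line (PID 12080 above, which is in state S+, not D). As a rough sketch, and assuming the five-column PID/TTY/STAT/TIME/COMMAND layout shown above, the same listing could be filtered on the STAT column alone. The function name uninterruptible is only an illustrative choice:

```python
def uninterruptible(ps_output):
    """Return (pid, stat, command) for rows whose STAT column starts with 'D'.

    Assumes the default `ps ax` layout: PID TTY STAT TIME COMMAND.
    """
    rows = []
    for line in ps_output.splitlines():
        # Split into at most five fields so the command keeps its spaces.
        fields = line.split(None, 4)
        if len(fields) < 5 or not fields[0].isdigit():
            continue  # skip the header line and blank lines
        pid, _tty, stat, _time, command = fields
        if stat.startswith("D"):
            rows.append((int(pid), stat, command))
    return rows
```

Fed the text of the listing above, this keeps the kjournald, svnadmin, du and ls rows and drops the grep row.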

Call traces of those processes plus the nbd-client:
---------------------------------------------------

 nbd-client    S 00000001     0 11253      1                2774 (NOTLB)
c1dafacc 00000082 5380f91a 00000001 00000001 0000000a c4b07550 c766daa0 01b614f2 00018416 00006c6e 00000000 c4b07660 c11036e0 00000286 00000286 c4cb0980 00000246 00000001 c53d5b80 c4cb0980 c0253973 c45f1c34 7fffffff
 Call Trace:
  [<c0253973>] tcp_rcv_established+0x75e/0x7b7
  [<c027f78a>] schedule_timeout+0x13/0x8c
  [<c028085f>] _spin_lock_bh+0x8/0x18
  [<c022216f>] release_sock+0xc/0x91
  [<c02228c2>] sk_wait_data+0x67/0x98
  [<c012d92d>] autoremove_wake_function+0x0/0x2d
  [<c024d06a>] tcp_recvmsg+0x397/0x9e9
  [<c0221c46>] sock_common_recvmsg+0x2f/0x45
  [<c021ff1e>] sock_recvmsg+0xe5/0x100
  [<c012d92d>] autoremove_wake_function+0x0/0x2d
  [<c0116412>] __activate_task+0x1c/0x29
  [<c011776e>] try_to_wake_up+0x355/0x35f
  [<c012d93a>] autoremove_wake_function+0xd/0x2d
  [<c02219db>] kernel_recvmsg+0x2b/0x3a
  [<c8a1b123>] sock_xmit+0x123/0x22c [nbd]
  [<c015cf25>] end_bio_bh_io_sync+0x0/0x39
  [<c012d912>] __wake_up_bit+0x29/0x2e
  [<c0143f7c>] mempool_free+0x5f/0x63
  [<c015cf25>] end_bio_bh_io_sync+0x0/0x39
  [<c015e552>] bio_put+0x28/0x29
  [<c015cf5a>] end_bio_bh_io_sync+0x35/0x39
  [<c015e6fd>] bio_endio+0x50/0x55
  [<c0143f7c>] mempool_free+0x5f/0x63
  [<c8a1b99a>] nbd_ioctl+0x32e/0x684 [nbd]
  [<c0146b38>] __do_page_cache_readahead+0x69/0x1e8
  [<c88f5272>] __journal_file_buffer+0x10e/0x1e3 [jbd]
  [<c0146cfd>] blockable_page_cache_readahead+0x46/0x99
  [<c0146dc0>] make_ahead_window+0x70/0x8d
  [<c01214ab>] current_fs_time+0x4a/0x53
  [<c016fc2a>] touch_atime+0x60/0x92
  [<c01afd93>] blkdev_driver_ioctl+0x4b/0x5b
  [<c01b03a9>] blkdev_ioctl+0x606/0x655
  [<c0145841>] __alloc_pages+0x4e/0x275
  [<c014170b>] find_get_page+0x18/0x38
  [<c0143ccd>] filemap_nopage+0x19c/0x313
  [<c014c2ee>] __handle_mm_fault+0x408/0x740
  [<c01154b6>] do_page_fault+0x18a/0x481
  [<c016041d>] block_ioctl+0x13/0x16
  [<c016040a>] block_ioctl+0x0/0x16
  [<c0169268>] do_ioctl+0x1c/0x5d
  [<c01694f3>] vfs_ioctl+0x24a/0x25c
  [<c016954d>] sys_ioctl+0x48/0x5f
  [<c0102c11>] sysenter_past_esp+0x56/0x79


 svnadmin      D 00000000     0 11790  11777                     (NOTLB)
c7a45dcc 00200082 5fed7199 00000000 c8938030 00000001 c65e8550 c02c66a0 01a2358a 00018416 00012c23 00000000 c65e8660 c11036e0 c014743b 00001000 00000000 c7a45da8 c7a45da8 c7a45e58 000000ff 00000000 00000000 c11036e0
 Call Trace:
  [<c014743b>] __pagevec_lru_add+0x90/0x9b
  [<c027f65c>] io_schedule+0x26/0x30
  [<c0141782>] sync_page+0x0/0x40
  [<c01417bf>] sync_page+0x3d/0x40
  [<c027f857>] __wait_on_bit_lock+0x2a/0x52
  [<c014177c>] __lock_page+0x51/0x57
  [<c012d95a>] wake_bit_function+0x0/0x3c
  [<c0141e32>] do_generic_mapping_read+0x1c9/0x42a
  [<c01428d2>] __generic_file_aio_read+0x16b/0x1b2
  [<c01415bb>] file_read_actor+0x0/0xca
  [<c0142954>] generic_file_aio_read+0x3b/0x42
  [<c0159e13>] do_sync_read+0xb6/0xf1
  [<c012d92d>] autoremove_wake_function+0x0/0x2d
  [<c0159d5d>] do_sync_read+0x0/0xf1
  [<c015a71c>] vfs_read+0x9f/0x141
  [<c015ab68>] sys_read+0x3c/0x63
  [<c0102c11>] sysenter_past_esp+0x56/0x79


 du            D 00000000     0 11814  11795                     (NOTLB)
c42c3e5c 00000086 037d7455 00000000 c1ea13ac 00000008 c3c55550 c02c66a0 61dec800 00018432 00249f24 00000000 c3c55660 c11036e0 00000000 00000004 c015bacf c77b196c c3b71c34 c10ee840 000000ff 00000000 00000000 c11036e0
 Call Trace:
  [<c015bacf>] __find_get_block_slow+0xfb/0x105
  [<c027f65c>] io_schedule+0x26/0x30
  [<c015c546>] sync_buffer+0x0/0x33
  [<c015c576>] sync_buffer+0x30/0x33
  [<c027f857>] __wait_on_bit_lock+0x2a/0x52
  [<c015c546>] sync_buffer+0x0/0x33
  [<c027f8e1>] out_of_line_wait_on_bit_lock+0x62/0x6a
  [<c012d95a>] wake_bit_function+0x0/0x3c
  [<c015c698>] __lock_buffer+0x21/0x24
  [<c88f56d2>] do_get_write_access+0x4c/0x462 [jbd]
  [<c892a4cf>] __ext3_get_inode_loc+0x109/0x2b9 [ext3]
  [<c88f5b00>] journal_get_write_access+0x18/0x26 [jbd]
  [<c892a6bf>] ext3_reserve_inode_write+0x2f/0x78 [ext3]
  [<c892a719>] ext3_mark_inode_dirty+0x11/0x27 [ext3]
  [<c892d06a>] ext3_dirty_inode+0x53/0x66 [ext3]
  [<c0176f11>] __mark_inode_dirty+0x27/0x15a
  [<c016fc2a>] touch_atime+0x60/0x92
  [<c0169733>] vfs_readdir+0x76/0x8d
  [<c0169564>] filldir64+0x0/0xc3
  [<c01697ad>] sys_getdents64+0x63/0xa5
  [<c0102c11>] sysenter_past_esp+0x56/0x79


 ls            D 00000000     0 11920  11902                     (NOTLB)
c71d3f54 00000086 018f610b 00000000 c02ccbe8 00000008 c4a8d000 c02c66a0 9d38226e 00018485 00601f9e 00000000 c4a8d110 c11036e0 00001000 00000000 01b11067 c1036220 c10582ac c2c15180 000000ff 00000000 00000000 c72ffb28
 Call Trace:
  [<c027faff>] __mutex_lock_slowpath+0x4a/0x79
  [<c027fb33>] .text.lock.mutex+0x5/0x14
  [<c0169708>] vfs_readdir+0x4b/0x8d
  [<c0169564>] filldir64+0x0/0xc3
  [<c01697ad>] sys_getdents64+0x63/0xa5
  [<c0102c11>] sysenter_past_esp+0x56/0x79
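
To make traces like the ones above easier to compare, here is a minimal sketch that pulls the symbol, offset, size and module out of each frame line. It assumes the "[<address>] symbol+0xOFF/0xSIZE [module]" layout seen in this post; the names TRACE_RE and parse_trace are hypothetical helpers, not part of any tool:

```python
import re

# One frame line looks like: [<c88f56d2>] do_get_write_access+0x4c/0x462 [jbd]
TRACE_RE = re.compile(
    r"\[<([0-9a-f]+)>\]\s+(\S+)\+0x([0-9a-f]+)/0x([0-9a-f]+)(?:\s+\[(\w+)\])?"
)

def parse_trace(text):
    """Return a list of (symbol, offset, size, module) tuples, one per frame.

    module is None for frames that belong to the core kernel image.
    """
    frames = []
    for line in text.splitlines():
        match = TRACE_RE.search(line)
        if match:
            _addr, symbol, offset, size, module = match.groups()
            frames.append((symbol, int(offset, 16), int(size, 16), module))
    return frames
```

Run over the du trace, for example, this would show the frames in the jbd and ext3 modules next to the core kernel ones, which makes it easier to see where each process is waiting.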


--
Eric Gerlach, Network Administrator
Federation of Students
University of Waterloo
p: (519) 888-4567 x36329
e: egerlach@...135...


