
Re: [Nbd] nbd hanging doing an svnadmin hotcopy



About an hour and a half after it froze, it started going again, but then it froze on the next repository (it's a batch job that backs up all the repositories) and has been frozen for the last six hours or so. Now svnadmin is stuck in a different place, though nbd-client seems to be in roughly the same place.

Again, if there's anyone out there who can offer a pointer (or say where more pertinent data could be found), it would be much appreciated.

nbd-client    S 00000001     0 11253      1                2774 (NOTLB)
c1dafacc 00000082 b6c82658 00000001 c0254674 0000000a c4b07550 c4a76aa0 f869f095 00019b9c 000452d4 00000000 c4b07660 c11036e0 c53d5b80 c70a7c80 05523c97 00000246 c45f1a34 c53d5b80 c70a7c80 c0253973 c45f1a34 7fffffff
 Call Trace:
  [<c0254674>] tcp_transmit_skb+0x604/0x632
  [<c0253973>] tcp_rcv_established+0x75e/0x7b7
  [<c027f78a>] schedule_timeout+0x13/0x8c
  [<c028085f>] _spin_lock_bh+0x8/0x18
  [<c022216f>] release_sock+0xc/0x91
  [<c02228c2>] sk_wait_data+0x67/0x98
  [<c012d92d>] autoremove_wake_function+0x0/0x2d
  [<c024d06a>] tcp_recvmsg+0x397/0x9e9
  [<c0221c46>] sock_common_recvmsg+0x2f/0x45
  [<c021ff1e>] sock_recvmsg+0xe5/0x100
  [<c0242d14>] ip_local_deliver+0x15b/0x207
  [<c012d92d>] autoremove_wake_function+0x0/0x2d
  [<c022a7d3>] process_backlog+0x7a/0xe7
  [<c0116412>] __activate_task+0x1c/0x29
  [<c01036b6>] common_interrupt+0x1a/0x20
  [<c0116412>] __activate_task+0x1c/0x29
  [<c02219db>] kernel_recvmsg+0x2b/0x3a
  [<c8a1b123>] sock_xmit+0x123/0x22c [nbd]
  [<c012d93a>] autoremove_wake_function+0xd/0x2d
  [<c011624d>] __wake_up_common+0x2f/0x53
  [<c011669e>] __wake_up+0x2a/0x3d
  [<c01ad8dc>] clear_queue_congested+0x35/0x38
  [<c8a1b99a>] nbd_ioctl+0x32e/0x684 [nbd]
  [<c0146b38>] __do_page_cache_readahead+0x69/0x1e8


svnadmin      D 00000000     0 12182  11777                     (NOTLB)
c2233dcc 00200086 76e296d9 00000000 c1ac9080 0000000a c4a8d000 c24c9000 a5df090e 00018800 0003af32 00000000 c4a8d110 c11036e0 8888dac6 00001000 00000000 c2233da8 c2233da8 c2233e58 c3bf5180 00002efc 00200046 c11036e0
 Call Trace:
  [<c027f65c>] io_schedule+0x26/0x30
  [<c0141782>] sync_page+0x0/0x40
  [<c01417bf>] sync_page+0x3d/0x40
  [<c027f857>] __wait_on_bit_lock+0x2a/0x52
  [<c014177c>] __lock_page+0x51/0x57
  [<c012d95a>] wake_bit_function+0x0/0x3c
  [<c0141e32>] do_generic_mapping_read+0x1c9/0x42a
  [<c01428d2>] __generic_file_aio_read+0x16b/0x1b2
  [<c01415bb>] file_read_actor+0x0/0xca
  [<c0142954>] generic_file_aio_read+0x3b/0x42
  [<c0159e13>] do_sync_read+0xb6/0xf1
  [<c012d92d>] autoremove_wake_function+0x0/0x2d
  [<c0159d5d>] do_sync_read+0x0/0xf1
  [<c015a71c>] vfs_read+0x9f/0x141
  [<c015ab68>] sys_read+0x3c/0x63
  [<c0102c7b>] syscall_call+0x7/0xb
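
For anyone who wants to compare traces like the two above between snapshots rather than by eye, here is a minimal sketch in Python. It assumes the frame lines keep the `[<address>] symbol+0xoffset/0xsize [module]` layout shown above; the names `FRAME_RE`, `parse_frames` and `same_stack` are made up for this illustration and are not part of nbd or any kernel tool.

```python
import re

# Matches frame lines such as "[<c8a1b123>] sock_xmit+0x123/0x22c [nbd]".
# The trailing "[module]" part is optional.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>[\w-]+)\])?"
)


def parse_frames(trace_text):
    """Return a list of (symbol, module) pairs, one per frame found.

    module is None for frames that carry no "[module]" suffix.
    """
    return [(m.group("symbol"), m.group("module"))
            for m in FRAME_RE.finditer(trace_text)]


def same_stack(trace_a, trace_b):
    """True if both traces list the same symbols in the same order.

    Addresses and offsets are ignored on purpose, so this only says the
    task is in roughly the same place, not at the same instruction.
    """
    return parse_frames(trace_a) == parse_frames(trace_b)
```

Feeding it two dumps of the same task taken some minutes apart gives a quick yes/no on whether the stack has moved at all.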


Eric Gerlach wrote:
Hi,

I'm testing out using nbd to host svn repositories because nfs doesn't work. However, when trying to do an svnadmin hotcopy the whole thing has hung. I can't even kill -9 the processes involved, so I'm assuming it's in the kernel.
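
A process that ignores kill -9 is usually in uninterruptible sleep (state D), where signals are not delivered until the kernel wakes the task. As a rough way to confirm that, the sketch below reads the state field from `/proc/<pid>/stat`. It assumes a Linux `/proc` filesystem; `parse_stat` and `blocked_tasks` are names invented for this example.

```python
import os


def parse_stat(stat_line):
    """Split one /proc/<pid>/stat line into (pid, comm, state).

    comm is wrapped in parentheses and may itself contain spaces or ')',
    so the last ')' on the line is taken as the end of the name.
    """
    open_idx = stat_line.index("(")
    close_idx = stat_line.rindex(")")
    pid = int(stat_line[:open_idx].strip())
    comm = stat_line[open_idx + 1:close_idx]
    state = stat_line[close_idx + 1:].split()[0]
    return pid, comm, state


def blocked_tasks(proc_root="/proc"):
    """Return (pid, comm) for every process currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError):
            continue  # the process exited or the line was unreadable
        if state == "D":
            found.append((pid, comm))
    return found
```

Calling `blocked_tasks()` on the affected machine should list the stuck processes if they really are blocked inside the kernel.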

I thought it might be a TCP timeout issue, so I've waited long past the 15 minute TCP timeout (read that somewhere), but not one bit has changed in the call traces of the processes during that time.

It's a Debian Etch system, running the latest (2.6.18-5) kernel with nbd-client 2.8.7 userland. I've pulled the call traces of the offending processes in case that helps anyone. If I'm asking for help in the wrong place, let me know so I can try elsewhere. However, if anyone has any experience with this, it would be greatly appreciated.

Thanks very much in advance for your help.

Cheers,

Eric

--
Eric Gerlach, Network Administrator
Federation of Students
University of Waterloo
p: (519) 888-4567 x36329
e: egerlach@...135...


