
Re: RAID1 problem - server freezes on md data-check



First let me say thank you to all who shared their experience and
knowledge. It was really helpful.

Yesterday I managed to replace the 1.5Gb/s drive with a 3.0Gb/s drive, so
both drives are now identical. The replacement required rebuilding the
array. The rebuild completed, with one exception: at the end of the
reconstruction process I got "task * blocked for more than 120 seconds"
messages in my logs:

Jan  4 23:38:35 barambo1 kernel: [12517.683173] INFO: task
kjournald:1088 blocked for more than 120 seconds.
Jan  4 23:38:35 barambo1 kernel: [12517.683227] "echo 0
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan  4 23:38:35 barambo1 kernel: [12517.683310] kjournald     D 0735ccb1
0  1088      2
Jan  4 23:38:35 barambo1 kernel: [12517.683313]        f78ef0c0 00000046
f7817c28 0735ccb1 00000a43 f78ef24c c180bfc0 00000000 
Jan  4 23:38:35 barambo1 kernel: [12517.683319]        f7495edc 008e3f0c
000006b2 00000000 008e3f0c f7495edc 008e3f0c f7939f18 
Jan  4 23:38:35 barambo1 kernel: [12517.683326]        c180bfc0 01451000
f7939f18 c1801688 c02b8a70 f7939f10 00000000 c019098e 
Jan  4 23:38:35 barambo1 kernel: [12517.683332] Call Trace:
Jan  4 23:38:35 barambo1 kernel: [12517.683339]  [<c02b8a70>]
io_schedule+0x49/0x80
Jan  4 23:38:35 barambo1 kernel: [12517.683343]  [<c019098e>]
sync_buffer+0x30/0x33
Jan  4 23:38:35 barambo1 kernel: [12517.683347]  [<c02b8c5e>]
__wait_on_bit+0x33/0x58
Jan  4 23:38:35 barambo1 kernel: [12517.683351]  [<c019095e>]
sync_buffer+0x0/0x33
Jan  4 23:38:35 barambo1 kernel: [12517.683355]  [<c019095e>]
sync_buffer+0x0/0x33
Jan  4 23:38:35 barambo1 kernel: [12517.683358]  [<c02b8ce2>]
out_of_line_wait_on_bit+0x5f/0x67
Jan  4 23:38:35 barambo1 kernel: [12517.683364]  [<c01319c9>]
wake_bit_function+0x0/0x3c
Jan  4 23:38:35 barambo1 kernel: [12517.683369]  [<c019092a>]
__wait_on_buffer+0x16/0x18
Jan  4 23:38:35 barambo1 kernel: [12517.683373]  [<f894fd7a>]
journal_commit_transaction+0x6cf/0xb3d [jbd]
Jan  4 23:38:35 barambo1 kernel: [12517.683386]  [<c0129b2c>]
lock_timer_base+0x19/0x35
Jan  4 23:38:35 barambo1 kernel: [12517.683393]  [<f8952468>] kjournald
+0xa5/0x1c6 [jbd]
Jan  4 23:38:35 barambo1 kernel: [12517.683402]  [<c013199c>]
autoremove_wake_function+0x0/0x2d
Jan  4 23:38:35 barambo1 kernel: [12517.683406]  [<f89523c3>] kjournald
+0x0/0x1c6 [jbd]
Jan  4 23:38:35 barambo1 kernel: [12517.683414]  [<c01318db>] kthread
+0x38/0x5d
Jan  4 23:38:35 barambo1 kernel: [12517.683417]  [<c01318a3>] kthread
+0x0/0x5d
Jan  4 23:38:35 barambo1 kernel: [12517.683421]  [<c01044f3>]
kernel_thread_helper+0x7/0x10
Jan  4 23:38:35 barambo1 kernel: [12517.683426]  =======================

(Please check the attached file for similar messages from other
processes.) However, after several minutes the server returned to its
normal state and has been working fine since then. It is now running the
stock linux-image-2.6.26-2-686 kernel. Any ideas?
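
For anyone going through a log like the attached one, here is a minimal
sketch that tallies which tasks were reported as blocked. It assumes the
messages have the "INFO: task NAME:PID blocked for more than N seconds"
shape shown above. The function name summarize_hung_tasks and the
command-line handling are made up for illustration and are not part of
any existing tool.

```python
import gzip
import re
import sys
from collections import Counter

# Matches "INFO: task NAME:PID blocked for more than N seconds".
# Whitespace is matched loosely because mail clients may wrap the line
# between "task" and the task name, as in the excerpt above.
HUNG_TASK_RE = re.compile(
    r"INFO:\s+task\s+(?P<name>.+?):(?P<pid>\d+)\s+blocked\s+for\s+more\s+than\s+"
    r"(?P<secs>\d+)\s+seconds"
)


def summarize_hung_tasks(log_text):
    """Return a Counter mapping (task name, pid) to the number of reports."""
    counts = Counter()
    for match in HUNG_TASK_RE.finditer(log_text):
        counts[(match.group("name"), int(match.group("pid")))] += 1
    return counts


def read_log(path):
    """Read a plain or gzip-compressed log file as text."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", errors="replace") as handle:
        return handle.read()


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python hung_tasks.py kern.log.gz
    for (name, pid), count in summarize_hung_tasks(read_log(sys.argv[1])).most_common():
        print(f"{name} (pid {pid}): {count} report(s)")
```

Running it against the log prints one line per blocked task with the
number of times it was reported, which makes it easier to see whether
only kjournald was stuck or whether other processes were waiting on I/O
as well.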

Best Regards,

George Chelidze

Attachment: kern.log.gz
Description: GNU Zip compressed data

