
was getting disk failure errors, repaired the sectors, now what?



When I rebooted my machine earlier today, it would not load the kernel
and gave some "media error" messages.

I did some basic hardware debugging and ended up with my hard disk
manufacturer's diagnostic utility telling me that there were bad sectors
on the drive. This was from a Windows 7 machine, and it would not repair
the disk. I searched some more and realized I should run it from a boot
disk created by the diagnostic utility (as opposed to from within
Windows). So I did that, rebooted into DR DOS and ran the test again.
This time the test reported errors but also repaired them.


I then checked my logs from the last few days, and it seems the problem
started only 3 or 4 days ago (the errors are given further below). The
problem appears to be in /dev/sdb3. The good news is that I do regular
backups of my /home (I don't care about the system files, I can always
reinstall), so I wasn't worried about losing any data (the OS and
/home partitions are on sda). The bad news is that my backups are on
/dev/sdb3 :(


So now I know that my backups, at least the ones from the last four or
so days, are most probably not trustworthy. No problem; I do rolling
backups using cron and rsync. But what do I do now? Do I just delete the
backups from the last four days and resume regular ones? How risky is it
to keep using the partition, even though the manufacturer's diagnostic
utility now reports no errors?
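
Rather than guessing which backups are still good, one option would be
to compare the live /home against a backup copy file by file. What
follows is only a minimal sketch of that idea: the function names and
the directory layout are hypothetical (nothing above says how the rsync
backups are laid out), and it assumes the backup mirrors the source
tree path for path.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(source_root, backup_root):
    """Compare every file under source_root with its copy under backup_root.

    Returns three sorted lists of paths relative to source_root:
    files missing from the backup, files whose contents differ,
    and files that could not be read on either side.
    """
    missing, mismatched, unreadable = [], [], []
    for dirpath, _dirnames, filenames in os.walk(source_root):
        for name in filenames:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, source_root)
            dst = os.path.join(backup_root, rel)
            if not os.path.isfile(dst):
                missing.append(rel)
                continue
            try:
                if sha256_of(src) != sha256_of(dst):
                    mismatched.append(rel)
            except OSError:
                # An I/O error while reading lands here.
                unreadable.append(rel)
    return sorted(missing), sorted(mismatched), sorted(unreadable)
```

Files edited since the backup ran will also show up as mismatched, so
the result needs reading with the backup date in mind. The "unreadable"
list is the interesting one for a disk that has been throwing I/O
errors.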

Another thing: I get this when I log in to a console. What is it all about?
#-------------------------------------------------------------#
Last login: Wed Jun 30 17:23:40 EDT 2010 from localhost on pts/9
/etc/update-motd.d/20-cpu-checker: line 3:
/usr/lib/update-notifier/update-motd-cpu-checker: No such file or directory
/etc/update-motd.d/20-cpu-checker: line 3: exec:
/usr/lib/update-notifier/update-motd-cpu-checker: cannot execute: No
such file or directory
run-parts: /etc/update-motd.d/20-cpu-checker exited with return code 126
/etc/update-motd.d/90-updates-available: line 3:
/usr/lib/update-notifier/update-motd-updates-available: No such file or
directory
/etc/update-motd.d/90-updates-available: line 3: exec:
/usr/lib/update-notifier/update-motd-updates-available: cannot execute:
No such file or directory
run-parts: /etc/update-motd.d/90-updates-available exited with return
code 126
/etc/update-motd.d/98-reboot-required: line 3:
/usr/lib/update-notifier/update-motd-reboot-required: No such file or
directory
/etc/update-motd.d/98-reboot-required: line 3: exec:
/usr/lib/update-notifier/update-motd-reboot-required: cannot execute: No
such file or directory
run-parts: /etc/update-motd.d/98-reboot-required exited with return code 126
Linux red 2.6.32-100601-red-1394 #1 Tue Jun 1 00:13:15 EDT 2010 i686

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
You have mail.
#-------------------------------------------------------------#


The errors from the log are given below.

#-------------------------------------------------------------#
Jun 28 10:30:01 red /USR/SBIN/CRON[11462]: (CRON) error (grandchild
#11463 failed with exit status 5)
Jun 28 10:30:01 red kernel: [210131.012427] sd 0:0:1:0: [sdb] Unhandled
error code
Jun 28 10:30:01 red kernel: [210131.012436] sd 0:0:1:0: [sdb] Result:
hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Jun 28 10:30:01 red kernel: [210131.012444] sd 0:0:1:0: [sdb] CDB:
Read(10): 28 00 09 6d 24 54 00 00 08 00
Jun 28 10:30:01 red kernel: [210131.012459] end_request: I/O error, dev
sdb, sector 158147668
Jun 28 10:30:01 red kernel: [210131.012489] EXT3-fs error (device sdb3):
ext3_get_inode_loc: unable to read inode block - inode=2894305,
block=5799941
Jun 28 10:51:48 red -- MARK --
Jun 28 11:08:26 red smartd[1577]: Device: /dev/sda [SAT], SMART Usage
Attribute: 194 Temperature_Celsius changed from 98 to 96
Jun 28 11:17:02 red /USR/SBIN/CRON[11544]: (root) CMD (   cd / &&
run-parts --report /etc/cron.hourly)
Jun 28 11:31:48 red -- MARK --
Jun 28 11:51:48 red -- MARK --
Jun 28 11:55:46 red kernel: [215276.064441] sd 0:0:1:0: [sdb] Unhandled
error code
Jun 28 11:55:46 red kernel: [215276.064450] sd 0:0:1:0: [sdb] Result:
hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Jun 28 11:55:46 red kernel: [215276.064458] sd 0:0:1:0: [sdb] CDB:
Read(10): 28 00 00 06 12 6f 00 00 08 00
Jun 28 11:55:46 red kernel: [215276.064473] end_request: I/O error, dev
sdb, sector 397935
Jun 28 11:55:46 red kernel: [215276.064497] EXT3-fs error (device sdb5):
ext3_find_entry: reading directory #2 offset 0
Jun 28 11:55:46 red kernel: [215276.064553] sd 0:0:1:0: [sdb] Unhandled
error code
Jun 28 11:55:46 red kernel: [215276.064557] sd 0:0:1:0: [sdb] Result:
hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Jun 28 11:55:46 red kernel: [215276.064563] sd 0:0:1:0: [sdb] CDB:
Write(10): 2a 00 00 05 e2 57 00 00 08 00
Jun 28 11:55:46 red kernel: [215276.064576] end_request: I/O error, dev
sdb, sector 385623
Jun 28 11:55:46 red kernel: [215276.064581] __ratelimit: 11 callbacks
suppressed
Jun 28 11:55:46 red kernel: [215276.064586] Buffer I/O error on device
sdb5, logical block 0
Jun 28 11:55:46 red kernel: [215276.064591] lost page write due to I/O
error on sdb5
#-------------------------------------------------------------#

#-------------------------------------------------------------#
Jun 28 22:00:02 red kernel: [251531.627158] EXT3-fs error (device sdb3):
ext3_remount: Abort forced by user
Jun 28 22:00:02 red /USR/SBIN/CRON[12219]: (CRON) error (grandchild
#12220 failed with exit status 32)
#-------------------------------------------------------------#

#-------------------------------------------------------------#
Jun 29 07:38:54 red kernel: [286263.445752] sd 0:0:1:0: [sdb] Unhandled
error code
Jun 29 07:38:54 red kernel: [286263.445761] sd 0:0:1:0: [sdb] Result:
hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Jun 29 07:38:54 red kernel: [286263.445769] sd 0:0:1:0: [sdb] CDB:
Read(10): 28 00 06 a9 34 4c 00 00 08 00
Jun 29 07:38:54 red kernel: [286263.445785] end_request: I/O error, dev
sdb, sector 111752268
Jun 29 07:38:54 red kernel: [286263.445811] EXT3-fs error (device sdb3):
ext3_find_entry: reading directory #2 offset 0
Jun 29 07:38:54 red kernel: [286263.457980] sd 0:0:1:0: [sdb] Unhandled
error code
Jun 29 07:38:54 red kernel: [286263.457988] sd 0:0:1:0: [sdb] Result:
hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Jun 29 07:38:54 red kernel: [286263.457995] sd 0:0:1:0: [sdb] CDB:
Read(10): 28 00 00 06 12 6f 00 00 08 00
Jun 29 07:38:54 red kernel: [286263.458010] end_request: I/O error, dev
sdb, sector 397935
Jun 29 07:38:54 red kernel: [286263.458369] EXT3-fs error (device sdb5):
ext3_find_entry: reading directory #2 offset 0
Jun 29 07:38:54 red kernel: [286263.465344] sd 0:0:1:0: [sdb] Unhandled
error code
Jun 29 07:38:54 red kernel: [286263.465352] sd 0:0:1:0: [sdb] Result:
hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Jun 29 07:38:54 red kernel: [286263.465359] sd 0:0:1:0: [sdb] CDB:
Write(10): 2a 00 00 05 e2 57 00 00 08 00
Jun 29 07:38:54 red kernel: [286263.465374] end_request: I/O error, dev
sdb, sector 385623
Jun 29 07:38:54 red kernel: [286263.465383] Buffer I/O error on device
sdb5, logical block 0
Jun 29 07:38:54 red kernel: [286263.465388] lost page write due to I/O
error on sdb5
Jun 29 07:38:54 red kernel: [286263.475622] sd 0:0:1:0: [sdb] Unhandled
error code
Jun 29 07:38:54 red kernel: [286263.475631] sd 0:0:1:0: [sdb] Result:
hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Jun 29 07:38:54 red kernel: [286263.475639] sd 0:0:1:0: [sdb] CDB:
Read(10): 28 00 0a 27 33 cb 00 00 08 00
Jun 29 07:38:54 red kernel: [286263.475654] end_request: I/O error, dev
sdb, sector 170341323
Jun 29 07:38:54 red kernel: [286263.475681] EXT3-fs error (device sdb4):
ext3_find_entry: reading directory #2 offset 0
Jun 29 07:38:54 red kernel: [286263.475713] sd 0:0:1:0: [sdb] Unhandled
error code
Jun 29 07:38:54 red kernel: [286263.475717] sd 0:0:1:0: [sdb] Result:
hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Jun 29 07:38:54 red kernel: [286263.475723] sd 0:0:1:0: [sdb] CDB:
Read(10): 28 00 0b d1 c8 33 00 00 08 00
Jun 29 07:38:54 red kernel: [286263.475735] end_request: I/O error, dev
sdb, sector 198297651
Jun 29 07:38:54 red kernel: [286263.475747] EXT3-fs error (device sdb4):
ext3_find_entry: reading directory #2 offset 1
#-------------------------------------------------------------#
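
As a cross-check on the kernel messages above, the CDB bytes can be
decoded by hand. In a 10-byte SCSI READ(10)/WRITE(10) command, byte 0
is the opcode (0x28 read, 0x2a write), bytes 2-5 are the big-endian
logical block address, and bytes 7-8 are the transfer length in blocks.
A small sketch (the function name is made up for illustration):

```python
def decode_rw10(cdb_hex):
    """Decode a READ(10)/WRITE(10) CDB given as space-separated hex bytes.

    Returns (operation, starting logical block address, block count).
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] not in (0x28, 0x2A):
        raise ValueError("not a 10-byte READ(10)/WRITE(10) CDB")
    operation = "READ(10)" if cdb[0] == 0x28 else "WRITE(10)"
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2..5: start block
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7..8: transfer length
    return operation, lba, blocks


if __name__ == "__main__":
    # CDBs copied from the log excerpts above.
    for cdb in (
        "28 00 09 6d 24 54 00 00 08 00",
        "28 00 00 06 12 6f 00 00 08 00",
        "2a 00 00 05 e2 57 00 00 08 00",
    ):
        print(decode_rw10(cdb))
```

The decoded addresses come out as 158147668, 397935 and 385623, which
are the same sector numbers the end_request lines report, each request
covering 8 blocks.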



Thanks.

-- 

Please reply to this list only. I read this list on its corresponding
newsgroup on gmane.org. Replies sent to my email address are just
filtered to a folder in my mailbox and get periodically deleted without
ever having been read.

