Hi,

In the last couple of days, I've begun to see both kernel errors and SMART warnings about my laptop's two and a half year old hard drive.

An excerpt of a current 'dmesg | grep hda' (these errors occurred upon resuming from suspend to disk):

[34074.459505] hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
[34074.459685] hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
[34074.459886] hda: possibly failed opcode: 0x25
[34079.744751] hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
[34079.744931] hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
[34079.745135] hda: possibly failed opcode: 0x25
[34079.750086] hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
[34079.750263] hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
[34079.750466] hda: possibly failed opcode: 0x25
[34079.789002] hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
[34079.789192] hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
[34079.789411] hda: possibly failed opcode: 0x25
[34079.794851] hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
[34079.795043] hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
[34079.795261] hda: possibly failed opcode: 0x25

I ran the short and long SMART self-tests, and they seem clean:

smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error        00%             5880  -
# 2  Short offline       Completed without error        00%             5879  -
# 3  Short offline       Completed without error        00%             1435  -

[#1 and #2 are the ones I ran yesterday, IIUC.]

I've attached the output of '# smartctl -a /dev/hda' to this mail.
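In case it helps anyone reading the dmesg excerpt, here is a minimal sketch of decoding the status and error bytes by hand. It assumes the standard ATA register bit positions and the bit names that the old IDE driver prints; the names STATUS_BITS, ERROR_BITS, decode and decode_line are made up for this illustration and do not come from any existing tool.

```python
import re

# Assumed mapping: standard ATA status register bits, labelled with the
# names the legacy Linux IDE driver appears to print.
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]

# Assumed mapping for the ATA error register, listed in the order the
# driver seems to print them (abort bit first). As far as I know the
# driver labels 0x80 differently when the abort bit is clear; this
# sketch ignores that case.
ERROR_BITS = [
    (0x04, "DriveStatusError"),
    (0x80, "BadCRC"),
    (0x40, "UncorrectableError"),
    (0x10, "SectorIdNotFound"),
    (0x02, "TrackZeroNotFound"),
    (0x01, "AddrMarkNotFound"),
]

FIELD_RE = re.compile(r"(status|error)=0x([0-9a-fA-F]{2})")


def decode(value, table):
    """Return the names of all bits in `table` that are set in `value`."""
    return [name for mask, name in table if value & mask]


def decode_line(line):
    """Decode one dma_intr line; return (field, names), or None if the
    line carries no status= or error= field."""
    match = FIELD_RE.search(line)
    if match is None:
        return None
    field = match.group(1)
    value = int(match.group(2), 16)
    table = STATUS_BITS if field == "status" else ERROR_BITS
    return field, decode(value, table)


if __name__ == "__main__":
    sample = [
        "[34074.459505] hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }",
        "[34074.459685] hda: dma_intr: error=0x84 { DriveStatusError BadCRC }",
        "[34074.459886] hda: possibly failed opcode: 0x25",
    ]
    for line in sample:
        print(decode_line(line))
```

Feeding it the output of 'dmesg | grep hda' line by line should show whether every event carries the same bit pattern or whether other error bits turn up over time.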
Here's an excerpt of syslog ('grep smartd /var/log/syslog', with a bunch of 'Temperature_Celsius changed' lines removed, since I think they're normal):

Jun 9 15:12:29 lizzie smartd[3474]: Device: /dev/hda, SMART Usage Attribute: 191 G-Sense_Error_Rate changed from 100 to 99
Jun 9 15:12:29 lizzie smartd[3474]: Device: /dev/hda, ATA error count increased from 12 to 17
Jun 9 15:12:29 lizzie smartd[3474]: Sending warning via mail to root@localhost ...
Jun 9 15:12:29 lizzie smartd[3474]: Warning via mail to root@localhost: successful
Jun 9 19:09:49 lizzie smartd[3474]: Device: /dev/hda, ATA error count increased from 17 to 28
Jun 9 20:42:29 lizzie smartd[3474]: Device: /dev/hda, SMART Usage Attribute: 191 G-Sense_Error_Rate changed from 99 to 100
Jun 10 14:09:30 lizzie smartd[3474]: Device: /dev/hda, SMART Prefailure Attribute: 2 Throughput_Performance changed from 100 to 105
Jun 10 14:09:30 lizzie smartd[3474]: Device: /dev/hda, SMART Prefailure Attribute: 3 Spin_Up_Time changed from 151 to 152
Jun 10 14:09:30 lizzie smartd[3474]: Device: /dev/hda, SMART Prefailure Attribute: 8 Seek_Time_Performance changed from 100 to 126
Jun 10 14:09:30 lizzie smartd[3474]: Device: /dev/hda, ATA error count increased from 28 to 34

So far, the only actual problem that I've noticed is a single failure to resume from disk yesterday. It came with a message about a checksum failure (I neglected to save it), which I believe was accompanied by kernel errors similar to the ones I've reproduced above.

Is this drive going? What further tests / diagnostics can I do?

[Yes, I have backups, and I'm going to redouble my attention to keeping them current and making sure that they're comprehensive.]

Celejar
--
mailmin.sourceforge.net - remote access via secure (OpenPGP) email
ssuds.sourceforge.net - A Simple Sudoku Solver and Generator
Attachment:
smart-info
Description: Binary data