Re: Bad blocks and powernowd

To: debian-user@lists.debian.org
Subject: Re: Bad blocks and powernowd
From: "Davide Mancusi" <arekfu@gmail.com>
Date: Mon, 19 Jan 2009 16:03:28 +0100
Message-id: <1fc2ea170901190703s8fe06c3r928b6059e1b1d178@mail.gmail.com>
In-reply-to: <49708ACF.7000009@physik.blm.tu-muenchen.de>
References: <1fc2ea170901160428m607fe4ccr8317f44aae351f1d@mail.gmail.com> <49708ACF.7000009@physik.blm.tu-muenchen.de>

2009/1/16 Johannes Wiedersich <johannes@physik.blm.tu-muenchen.de>:
> Davide Mancusi wrote:
>> The hard disk of my 4-year-old laptop is starting to fail. I ran
>> fsck.ext3 -c on my root partition yesterday and a few blocks were
>> marked as damaged. The blocks contained some XFCE4 theme files, so I
>> thought that reinstalling the relevant package should be enough. Now,
>> however, the machine hangs every time I start powernowd. Kernel
>> emergency key presses (Alt+SysRq+?) don't work and the usual log files
>> don't contain any relevant information. I have tried uninstalling and
>> reinstalling the powernowd package, but it didn't help; note also that
>> fsck did not signal any damaged files belonging to powernowd.
>>
>> Can anyone help me sort this out? Could it be that fsck -c did not
>> mark some blocks as damaged because I ran it with the root partition
>> mounted read-only (as opposed to unmounted)?
>
> If your disk is dying this could mean about anything.
>
> Try smartctl from smartmontools package. What does it report about the
> health status of your disk (after some testing)?
>
> Try e2fsck again to see, if it detects 'new' errors on your file system.
>
> I hope you have good back ups. You could try diff -r against your backup
> (mounted ro). However, if your disk is damaged and loads and runs
> garbled kernel stuff, you risk hosing your backup. Therefore it might be
> safer to investigate by booting a rescue system from CD or usb-disk. YMMV.

Thanks for your response, Johannes.

Now I'm confused. I installed smartmontools and ran
# smartctl -t long /dev/hda
which turned up two bad sectors. I followed the HOWTO at [1] and
reallocated the first one. (I had no idea one could recover bad
sectors. I thought they were as good as gone.) Then I ran the test
again to get the LBA of the second bad block. Surprise, surprise,
the test completed without problems.
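
For reference, the key step in that kind of procedure is converting the
LBA reported by the self-test into a file system block number. Below is a
minimal sketch of that arithmetic, assuming the HOWTO at [1] is the
smartmontools bad block HOWTO and that sectors are 512 bytes. The function
name, the partition start (63) and the block size (4096) are illustrative
assumptions, not values read from this machine; only the LBA comes from
the self-test log further down.

```shell
#!/bin/sh
# Convert a disk LBA to a file system block number.
#   $1 = LBA of the bad sector (from "smartctl -l selftest")
#   $2 = first sector of the partition (from "fdisk -lu /dev/hda")
#   $3 = file system block size in bytes (from "tune2fs -l <partition>")
# Assumes 512-byte sectors; integer division truncates, as intended.
lba_to_fs_block() {
    echo $(( ($1 - $2) * 512 / $3 ))
}

# Hypothetical partition start and block size, real LBA from the log:
lba_to_fs_block 95245863 63 4096    # prints 11905725
```

The resulting block number is what one would pass to debugfs to find out
which file, if any, owns the damaged block.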

I also tried booting off a live CD and running e2fsck -c -c on all
ext2/3 partitions. No bad blocks were detected, but one of the inode
tables was heavily modified. However, even though no files related to
powernowd were touched, powernowd now works again.

From the live CD I ran again
# smartctl -t long /dev/hda
[waited one hour]
# smartctl -l selftest /dev/hda
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      6264         -
# 2  Extended offline    Completed without error       00%      6262         -
# 3  Extended offline    Completed without error       00%      6259         -
# 4  Short offline       Completed without error       00%      6258         -
# 5  Extended offline    Completed: read failure       30%      6257         95245863
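
To pull the failing LBA out of such a log without reading it by eye, here
is a minimal sketch. It assumes the log lists the newest test first and
that a "-" in the last column means "no error", as in the output above.
The function name is made up for illustration.

```shell
#!/bin/sh
# Read "smartctl -l selftest" output on stdin and print the
# LBA_of_first_error of the most recent test that recorded one.
# Test entries start with "#"; the LBA (or "-") is the last field.
first_error_lba() {
    awk '/^#/ && $NF != "-" { print $NF; exit }'
}

# Usage (needs root and a real device, so it is left commented out):
# smartctl -l selftest /dev/hda | first_error_lba
```

If every logged test completed cleanly, the function prints nothing.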

You can see that the most recent tests completed without errors. However:

# smartctl -A /dev/hda
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   062    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   105   105   040    Pre-fail  Offline      -       5874
  3 Spin_Up_Time            0x0007   200   200   033    Pre-fail  Always       -       1
  4 Start_Stop_Count        0x0012   096   096   000    Old_age   Always       -       6796
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   120   120   040    Pre-fail  Offline      -       36
  9 Power_On_Hours          0x0012   086   086   000    Old_age   Always       -       6268
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       1167
191 G-Sense_Error_Rate      0x000a   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       65
193 Load_Cycle_Count        0x0012   065   065   000    Old_age   Always       -       359932
194 Temperature_Celsius     0x0002   130   130   000    Old_age   Always       -       42 (Lifetime Min/Max 11/58)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       9
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

I still have Current_Pending_Sector == 1, and smartd sends me an e-mail
complaining about it at every reboot. What should I do?
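
For anyone who wants to watch this counter from a script, here is a
minimal sketch. It assumes smartctl -A prints each attribute on a single
line with RAW_VALUE as the tenth whitespace-separated field, which is the
column order shown in the header above. The function name is made up for
illustration.

```shell
#!/bin/sh
# Read "smartctl -A" output on stdin and print the raw value of
# attribute 197 (Current_Pending_Sector).
# Field 2 is ATTRIBUTE_NAME, field 10 is RAW_VALUE.
pending_sectors() {
    awk '$2 == "Current_Pending_Sector" { print $10 }'
}

# Usage (needs root and a real device, so it is left commented out):
# smartctl -A /dev/hda | pending_sectors
```

A non-zero result means the drive still has sectors it could not read and
has not yet remapped or rewritten.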

Davide

[1] http://tinyurl.com/83g265
