Shutdown due to thermal event (sid)
Hi,
For some time now I have been getting nightly shutdowns "due to thermal event" (so the BIOS says).
I'm pretty sure the processor did not break down and the machine is in generally good health (I can run XP without problems).
One "strange" thing is that the shutdown always happens at night, shortly after 1 am.
I'm not aware of programs running at that time, but something must be triggering the overheating.
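One way to narrow this down could be to scan the crontab files for jobs whose hour field includes 1. The following is only a minimal sketch: the sample crontab text and the command names in it (such as nightly-backup) are hypothetical, and it assumes the system crontab format with five time fields followed by a user and a command.

```python
def hour_matches(field: str, hour: int) -> bool:
    """Return True if a cron hour field (e.g. '*', '1', '0-6/2', '1,13') includes hour."""
    for part in field.split(","):
        step = 1
        if "/" in part:
            part, step_text = part.split("/", 1)
            step = int(step_text)
        if part == "*":
            low, high = 0, 23
        elif "-" in part:
            low_text, high_text = part.split("-", 1)
            low, high = int(low_text), int(high_text)
        else:
            low = high = int(part)
        if low <= hour <= high and (hour - low) % step == 0:
            return True
    return False


def jobs_at_hour(crontab_text: str, hour: int) -> list:
    """Return the crontab lines whose hour field matches the given hour."""
    hits = []
    for line in crontab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # variable assignments such as SHELL=/bin/sh
        try:
            if hour_matches(fields[1], hour):
                hits.append(line)
        except ValueError:
            continue  # entries like '@daily' that have no numeric hour field
    return hits


# Hypothetical crontab contents, for illustration only.
SAMPLE = """# m h dom mon dow user command
SHELL=/bin/sh
17 * * * * root run-parts /etc/cron.hourly
5 1 * * * root /usr/local/bin/nightly-backup
25 6 * * * root run-parts /etc/cron.daily
0 0-6/2 * * * root /usr/local/bin/sync
"""

if __name__ == "__main__":
    for job in jobs_at_hour(SAMPLE, 1):
        print(job)
```

Hourly jobs are reported too, since they also run at 1 am. Per-user crontabs have no user field, so they would need a slightly different split.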
The machine is an Intel(R) Pentium(R) 4 CPU 2.80GHz on an Intel motherboard (D865GBF) with 1 GB RAM and a lot of disks (I added a PCI board with tertiary and quaternary IDE channels), running:
Linux heimdall 2.6.12-1-686 #1 Tue Sep 27 12:52:50 JST 2005 i686 GNU/Linux
from a very current version of Debian/sid.
The machine is primarily used as a server (file, www, DNS, DHCP, ADSL), but sometimes I run programs on it.
I did some homework and, in particular, I installed sensord.
The results are:
adm1027-i2c-0-2e
Adapter: SMBus I801 adapter at c400
V1.5: +1.481 V (min = +1.42 V, max = +1.58 V)
VCore: +1.354 V (min = +1.30 V, max = +1.44 V)
V3.3: +3.399 V (min = +3.13 V, max = +3.47 V)
V5: +5.130 V (min = +4.74 V, max = +5.26 V)
V12: +12.063 V (min = +11.38 V, max = +12.62 V)
CPU_Fan: 2828 RPM (min = 4000 RPM) ALARM
fan2: 0 RPM (min = 0 RPM)
fan3: 0 RPM (min = 0 RPM)
fan4: 645 RPM (min = 0 RPM)
CPU: +41.00°C (low = +10°C, high = +50°C)
Board: +31.50°C (low = +10°C, high = +35°C)
Remote: +33.25°C (low = +10°C, high = +35°C)
CPU_PWM: 255
Fan2_PWM: 255
Fan3_PWM: 77
vid: +1.375 V (VRM Version 9.1)
Philips PAL_BG -i2c-1-61
Adapter: bt878 #0 [sw]
tveeprom-i2c-1-50
Adapter: bt878 #0 [sw]
As you can see, the temperatures look quite reasonable, but under heavy load they start to rise and then seem to stabilize around 55°C. I admit I got nervous and lowered the load after less than 2 minutes, so the "stabilization" could be only apparent :)
The "bad part" is that I could not see any change in the fan activity, which seems to be completely independent of the temperature.
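To watch this over a longer run, the readings could be checked against their limits automatically. Below is a minimal sketch, assuming the text layout shown in the sensors output above; the function names are my own and the regular expressions only cover the RPM, V and °C lines seen there.

```python
import re

# A reading line looks like "CPU_Fan: 2828 RPM (min = 4000 RPM) ALARM".
READING = re.compile(
    r"^(?P<label>[^:]+):\s+(?P<value>[+-]?\d+(?:\.\d+)?)\s*(?P<unit>RPM|V|°C)"
    r"(?:\s+\((?P<limits>[^)]*)\))?"
)
# Limits inside the parentheses, e.g. "min = +1.42 V" or "high = +50°C".
LIMIT = re.compile(r"(min|max|low|high)\s*=\s*([+-]?\d+(?:\.\d+)?)")


def parse_line(line):
    """Return (label, value, unit, limits) for a reading line, else None."""
    match = READING.match(line.strip())
    if match is None:
        return None
    limits = {name: float(number)
              for name, number in LIMIT.findall(match.group("limits") or "")}
    return match.group("label"), float(match.group("value")), match.group("unit"), limits


def violations(text):
    """List readings that are below their min/low or above their max/high limit."""
    found = []
    for line in text.splitlines():
        parsed = parse_line(line)
        if parsed is None:
            continue
        label, value, unit, limits = parsed
        lower = limits.get("min", limits.get("low"))
        upper = limits.get("max", limits.get("high"))
        if lower is not None and value < lower:
            found.append((label, value, unit, "below", lower))
        elif upper is not None and value > upper:
            found.append((label, value, unit, "above", upper))
    return found


# A few lines copied from the output above.
SAMPLE = """adm1027-i2c-0-2e
Adapter: SMBus I801 adapter at c400
VCore: +1.354 V (min = +1.30 V, max = +1.44 V)
CPU_Fan: 2828 RPM (min = 4000 RPM) ALARM
fan2: 0 RPM (min = 0 RPM)
CPU: +41.00°C (low = +10°C, high = +50°C)
CPU_PWM: 255
vid: +1.375 V (VRM Version 9.1)
"""

if __name__ == "__main__":
    for item in violations(SAMPLE):
        print(item)
```

On the sample lines this flags only CPU_Fan, which is the same reading the chip itself marks with ALARM. Feeding it the live output of sensors once a minute and logging the result would show whether the fan speed ever follows the CPU temperature.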
I tried to install the fancontrol script (from the sensors package), but the configuration script (pwmconfig) failed badly:
...
Found the following PWM controls:
0-002e/pwm1
/usr/sbin/pwmconfig: line 102: 0-002e/pwm1_enable: Permission denied
0-002e/pwm2
/usr/sbin/pwmconfig: line 102: 0-002e/pwm2_enable: Permission denied
0-002e/pwm3
/usr/sbin/pwmconfig: line 102: 0-002e/pwm3_enable: Permission denied
Found the following fan sensors:
0-002e/fan1_input current speed: 2811 RPM
0-002e/fan2_input current speed: 0 ... skipping!
0-002e/fan3_input current speed: 0 ... skipping!
0-002e/fan4_input current speed: 661 RPM
...
which looks really bad :(
The offending line tries to access /proc/sys/dev/sensors/..., which is not present on my machine (/proc/sys/dev holds only four subdirs: cdrom, hpet, parport and scsi). I have to assume that something went wrong with my installation of sensors, but I cannot understand what! Sensord/sensors/xsensors seem to work fine. What am I missing?
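Since the pwmconfig output names 0-002e/pwm1_enable, it might be worth checking whether those files exist at all and whether they carry any write permission bit. This is only a sketch under assumptions: the directory /sys/bus/i2c/devices/0-002e is my guess at where the chip shows up on a 2.6 kernel, and the idea that a file without write bits would explain "Permission denied" is also a guess, not something I have confirmed.

```python
import stat
from pathlib import Path

# Assumed location of the chip on a 2.6 kernel; adjust if it lives elsewhere.
CHIP_DIR = "/sys/bus/i2c/devices/0-002e"


def pwm_enable_status(chip_dir):
    """Map each pwmN_enable file in chip_dir to 'writable' or 'read-only'.

    Only the permission bits are inspected, so the answer does not depend
    on which user runs the check.
    """
    report = {}
    for path in sorted(Path(chip_dir).glob("pwm[0-9]*_enable")):
        mode = path.stat().st_mode
        has_write_bit = bool(mode & (stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))
        report[path.name] = "writable" if has_write_bit else "read-only"
    return report


if __name__ == "__main__":
    status = pwm_enable_status(CHIP_DIR)
    if not status:
        print("no pwmN_enable files found under", CHIP_DIR)
    for name, state in status.items():
        print(name, state)
```

If the files turn out to be read-only, the problem would seem to lie with the driver rather than with the sensors installation; if they are missing, the chip directory is probably somewhere else.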
The list of the loaded modules follows:
Module Size Used by
ipt_MASQUERADE 3168 1
ipt_REJECT 5504 4
ipt_LOG 7168 10
ip6table_filter 2560 0
ip6_tables 18752 1 ip6table_filter
ipt_state 1696 15
ipt_pkttype 1440 4
iptable_raw 1824 0
ipt_CONNMARK 2016 0
ipt_MARK 2304 0
ipt_connmark 1472 0
ipt_owner 2976 0
ipt_recent 10764 0
ipt_iprange 1568 0
ipt_physdev 2032 0
ipt_multiport 2432 0
ipt_conntrack 2240 3
iptable_mangle 2656 0
ip_nat_irc 2528 0
ip_nat_tftp 1728 0
ip_nat_ftp 3328 0
iptable_nat 23092 5 ipt_MASQUERADE,ip_nat_irc,ip_nat_tftp,ip_nat_ftp
ip_conntrack_irc 71664 1 ip_nat_irc
ip_conntrack_tftp 4048 1 ip_nat_tftp
ip_conntrack_ftp 72848 1 ip_nat_ftp
ip_conntrack 44536 10 ipt_MASQUERADE,ipt_state,ipt_conntrack,ip_nat_irc,ip_nat_tftp,ip_nat_ftp,iptable_nat,ip_conntrack_irc,ip_conntrack_tftp,ip_conntrack_ftp
iptable_filter 2784 1
ip_tables 20128 18 ipt_MASQUERADE,ipt_REJECT,ipt_LOG,ipt_state,ipt_pkttype,iptable_raw,ipt_CONNMARK,ipt_MARK,ipt_connmark,ipt_owner,ipt_recent,ipt_iprange,ipt_physdev,ipt_multiport,ipt_conntrack,iptable_mangle,iptable_nat,iptable_filter
ppp_synctty 9824 1
ppp_generic 29620 5 ppp_synctty
slhc 7168 1 ppp_generic
n_hdlc 9188 1
i915 20512 1
drm 67732 2 i915
nfsd 225760 9
exportfs 5792 1 nfsd
af_packet 22216 2
lp 12164 0
thermal 13224 0
fan 4516 0
button 6416 0
processor 21876 1 thermal
ac 4612 0
battery 9348 0
ipv6 261984 14
nfs 217544 3
lockd 64968 3 nfsd,nfs
sunrpc 142180 14 nfsd,nfs,lockd
ext2 69800 1
dm_mod 60540 0
eeprom 7280 0
lm85 35812 0
i2c_sensor 3264 2 eeprom,lm85
i2c_isa 1888 0
usbhid 36480 0
usbmouse 5376 0
sd_mod 19664 4
e100 35968 0
tuner 27688 0
tvaudio 23716 0
eepro100 30864 0
mii 5696 2 e100,eepro100
bttv 157456 0
video_buf 21828 1 bttv
firmware_class 10112 1 bttv
i2c_algo_bit 9576 1 bttv
snd_bt87x 14536 0
v4l2_common 5696 1 bttv
btcx_risc 4968 1 bttv
tveeprom 13080 1 bttv
videodev 9568 1 bttv
snd_intel8x0 34016 0
snd_ac97_codec 83960 1 snd_intel8x0
hw_random 5204 0
snd_pcm 93416 3 snd_bt87x,snd_intel8x0,snd_ac97_codec
tpm_atmel 4800 0
snd_timer 24644 1 snd_pcm
snd 56260 5 snd_bt87x,snd_intel8x0,snd_ac97_codec,snd_pcm,snd_timer
parport_pc 36708 1
parport 36936 2 lp,parport_pc
ata_piix 9636 2
soundcore 9696 1 snd
libata 49604 1 ata_piix
tpm_nsc 6592 0
tpm 10432 2 tpm_atmel,tpm_nsc
snd_page_alloc 9860 3 snd_bt87x,snd_intel8x0,snd_pcm
i8xx_tco 7028 0
i2c_i801 8716 0
scsi_mod 138472 2 sd_mod,libata
i2c_core 21776 10 eeprom,lm85,i2c_sensor,i2c_isa,tuner,tvaudio,bttv,i2c_algo_bit,tveeprom,i2c_i801
ide_cd 43140 0
shpchp 99428 0
ehci_hcd 35336 0
uhci_hcd 32176 0
pci_hotplug 28468 1 shpchp
intel_agp 24092 1
cdrom 40640 1 ide_cd
usbcore 122300 5 usbhid,usbmouse,ehci_hcd,uhci_hcd
psmouse 31236 0
serio_raw 7108 0
intelfb 32800 0
agpgart 35560 4 drm,intel_agp,intelfb
evdev 9728 0
mousedev 11776 1
ext3 141736 8
jbd 56760 1 ext3
mbcache 9252 2 ext2,ext3
ide_disk 18688 13
ide_generic 1152 0 [permanent]
via82cxxx 13820 0 [permanent]
trm290 4196 0 [permanent]
triflex 3680 0 [permanent]
slc90e66 5664 0 [permanent]
sis5513 16488 0 [permanent]
siimage 12448 0 [permanent]
serverworks 9032 0 [permanent]
sc1200 7296 0 [permanent]
rz1000 2400 0 [permanent]
piix 10340 0 [permanent]
pdc202xx_old 11168 0 [permanent]
opti621 4324 0 [permanent]
ns87415 4264 0 [permanent]
hpt366 20384 0 [permanent]
hpt34x 5152 0 [permanent]
generic 3808 0 [permanent]
cy82c693 4676 0 [permanent]
cs5530 5312 0 [permanent]
cs5520 4544 0 [permanent]
cmd64x 12028 0 [permanent]
atiixp 5904 0 [permanent]
amd74xx 14396 0 [permanent]
alim15x3 12268 0 [permanent]
aec62xx 7360 0 [permanent]
pdc202xx_new 9248 0 [permanent]
ide_core 130388 28 ide_cd,ide_disk,ide_generic,via82cxxx,trm290,triflex,slc90e66,sis5513,siimage,serverworks,sc1200,rz1000,piix,pdc202xx_old,opti621,ns87415,hpt366,hpt34x,generic,cy82c693,cs5530,cs5520,cmd64x,atiixp,amd74xx,alim15x3,aec62xx,pdc202xx_new
unix 27888 438
fbcon 39936 0
tileblit 2240 1 fbcon
font 8096 1 fbcon
bitblit 5920 1 fbcon
vesafb 7992 0
cfbcopyarea 3872 2 intelfb,vesafb
cfbimgblt 2816 2 intelfb,vesafb
cfbfillrect 4128 2 intelfb,vesafb
softcursor 2176 2 intelfb,vesafb
capability 4584 0
commoncap 6912 1 capability
I do not know what else could be relevant, so I am not including any other info, but I'm ready to send anything you may think useful to solve this.
Thanks in Advance
Mauro
P.S.: I'm posting from a web mailer, so the mail might be sent in HTML, and I have no control over that. Could someone confirm or refute this? Thanks again
Mauro