
Re: Hardware failure (bug). CPU freezes. Hardware error. HELP!



On Sun, 21 Oct 2012 16:53:56 +0400 (MSK)
yuri.nefedov@gmail.com wrote:

> On Sun, 21 Oct 2012, alexander wrote:
> 
> > So here's what happened: it froze again!(( After that I forced the
> > computer off and booted from a Debian testing live DVD in recovery
> > mode. I looked at the file /var/log/kern.log (or messages, I'm not
> > sure, they seem to be the same thing). It contained these logs:
> >
> > Oct 17 20:17:56 alexander kernel: [  620.236274] CPU1: Package power limit notification (total events = 1)
> > Oct 17 20:17:56 alexander kernel: [  620.236279] CPU3: Package power limit notification (total events = 1)
> > Oct 17 20:17:56 alexander kernel: [  620.236282] CPU2: Package power limit notification (total events = 1)
> > Oct 17 20:17:56 alexander kernel: [  620.236285] CPU0: Package power limit notification (total events = 1)
> > Oct 17 20:17:56 alexander kernel: [  620.236928] CPU3: Package power limit normal
> > Oct 17 20:17:56 alexander kernel: [  620.236930] CPU2: Package power limit normal
> > Oct 17 20:17:56 alexander kernel: [  620.236933] CPU1: Package power limit normal
> > Oct 17 20:17:56 alexander kernel: [  620.236935] CPU0: Package power limit normal
> > Oct 17 20:22:35 alexander kernel: [  898.634647] [Hardware Error]: Machine check events logged
> > Oct 17 20:45:02 alexander kernel: imklog 5.8.11, log source = /proc/kmsg started.
> > Oct 17 20:45:02 alexander kernel: [    0.000000] Initializing cgroup subsys cpuset
> > Oct 17 20:45:02 alexander kernel: [    0.000000] Initializing cgroup subsys cpu
> > Oct 17 20:45:02 alexander kernel: [    0.000000] Linux version 3.2.0-3-amd64 (Debian
> >
> > That's it( I can't make sense of these logs. Clearly it's some kind
> > of hardware error( Note the Hardware Error line.
> >
> > alexander@alexander:~$ uname -a
> > Linux alexander 3.2.0-3-amd64 #1 SMP Mon Jul 23 02:45:17 UTC 2012 x86_64 GNU/Linux
> >
> > What should I do, my friends?))
> >
> >
> 
>   Package power limit notification is what you see when the processor
>   (i5 or i7?) enters turbo mode. Nothing to worry about.
I have an i7 ).

alexander@alexander:~$ cat /proc/cpuinfo
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 58
model name	: Intel(R) Core(TM) i7-3517U CPU @ 1.90GHz
stepping	: 9
microcode	: 0x12
cpu MHz		: 799.000
cache size	: 4096 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 2
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms
bogomips	: 4788.04
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

plus 3 more entries like this (since there are 4 threads).

>   But the fact that right after that you get
>   [Hardware Error]: Machine check events logged
>   is bad. Most likely it is overheating, or the power circuitry
>   cannot handle the load.
>   See https://en.wikipedia.org/wiki/Machine_Check_Exception
haven't read it yet, but I will) thx)
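
Since overheating is the first suspect, one way to watch the temperature is to read the kernel's thermal zones from sysfs. This is only a rough sketch: it assumes the machine exposes /sys/class/thermal/thermal_zone*/temp (values in millidegrees Celsius), which not every system does. The function name show_temps and its optional directory argument are made up for illustration.

```shell
# Print the type and temperature (whole degrees Celsius) of each thermal zone.
# Usage: show_temps [thermal_root]
# thermal_root is a hypothetical override, defaulting to /sys/class/thermal.
show_temps() {
    root=${1:-/sys/class/thermal}
    for z in "$root"/thermal_zone*; do
        [ -r "$z/temp" ] || continue              # skip zones without a reading
        t=$(cat "$z/temp")                        # millidegrees Celsius
        kind=$(cat "$z/type" 2>/dev/null || echo unknown)
        echo "$kind: $((t / 1000)) C"
    done
}
```

Calling show_temps in a loop while the machine is under load should show whether the readings climb shortly before a freeze.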

> 
>   Try installing mcelog, and next time you will see more
> detailed information.
already installed it) heh) but it's not clear what it is logging). Here's the log:

cat /var/log/mcelog

MCE 0
CPU 1 THERMAL EVENT TSC f99d4f19dbc 
TIME 1350822420 Sun Oct 21 23:27:00 2012
Processor 1 below trip temperature. Throttling disabled
STATUS c000000088250c00 MCGSTATUS 0
MCGCAP c07 APICID 1 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 58
mcelog: Unsupported new Family 6 Model 3a CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 1
CPU 3 THERMAL EVENT TSC f99d4f1bb1d 
TIME 1350822420 Sun Oct 21 23:27:00 2012
Processor 3 below trip temperature. Throttling disabled
STATUS c000000088250c00 MCGSTATUS 0
MCGCAP c07 APICID 3 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 58
mcelog: Unsupported new Family 6 Model 3a CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 2
CPU 0 THERMAL EVENT TSC f99d4f1d9e0 
TIME 1350822420 Sun Oct 21 23:27:00 2012
Processor 0 below trip temperature. Throttling disabled
STATUS c000000088250c00 MCGSTATUS 0
MCGCAP c07 APICID 0 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 58
mcelog: Unsupported new Family 6 Model 3a CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 3
CPU 2 THERMAL EVENT TSC f99d4f1de06 
TIME 1350822420 Sun Oct 21 23:27:00 2012
Processor 2 below trip temperature. Throttling disabled
STATUS c000000088250c00 MCGSTATUS 0
MCGCAP c07 APICID 2 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 58
mcelog: Unsupported new Family 6 Model 3a CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 4
CPU 1 THERMAL EVENT TSC f99d4f72841 
TIME 1350822420 Sun Oct 21 23:27:00 2012
Processor 1 below trip temperature. Throttling disabled
STATUS c000000088250800 MCGSTATUS 0
MCGCAP c07 APICID 1 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 58
mcelog: Unsupported new Family 6 Model 3a CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 5
CPU 2 THERMAL EVENT TSC f99d4f739f3 
TIME 1350822420 Sun Oct 21 23:27:00 2012
Processor 2 below trip temperature. Throttling disabled
STATUS c000000088250800 MCGSTATUS 0
MCGCAP c07 APICID 2 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 58
mcelog: Unsupported new Family 6 Model 3a CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 6
CPU 0 THERMAL EVENT TSC f99d4f75082 
TIME 1350822420 Sun Oct 21 23:27:00 2012
Processor 0 below trip temperature. Throttling disabled
STATUS c000000088250800 MCGSTATUS 0
MCGCAP c07 APICID 0 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 58
mcelog: Unsupported new Family 6 Model 3a CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 7
CPU 3 THERMAL EVENT TSC f99d4f754bd 
TIME 1350822420 Sun Oct 21 23:27:00 2012
Processor 3 below trip temperature. Throttling disabled
STATUS c000000088250800 MCGSTATUS 0
MCGCAP c07 APICID 3 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 58
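
To get an overview of a log like this, it can help to count the THERMAL EVENT records per CPU. The sketch below assumes the record layout shown above ("CPU <n> THERMAL EVENT ..." on one line); the function name count_thermal_events is made up for illustration.

```shell
# Count THERMAL EVENT records per CPU in an mcelog-style log file.
# Usage: count_thermal_events /var/log/mcelog
count_thermal_events() {
    # Match lines such as "CPU 1 THERMAL EVENT TSC f99d4f19dbc";
    # "CPUID ..." and "Processor ..." lines do not match because $1 differs.
    awk '$1 == "CPU" && $3 == "THERMAL" && $4 == "EVENT" { n[$2]++ }
         END { for (c in n) print "CPU " c ": " n[c] }' "$1" | sort
}
```

For the eight records above this prints a count of 2 for each of CPU 0 through CPU 3, so every entry in this log is a thermal event.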

> But in general you probably need to limit
> the maximum CPU frequency.
no idea how to do that, though) heh)
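
One possible way to do it, assuming the kernel's cpufreq driver is loaded and exposes scaling_max_freq under /sys/devices/system/cpu/cpuN/cpufreq/, is to write a lower limit (in kHz) into that file for every CPU. This is a minimal sketch, not a tested recipe for this machine: the function name cap_cpu_freq, its optional directory argument, and the value 1400000 are illustrative assumptions.

```shell
# Cap the maximum frequency of every CPU through the cpufreq sysfs files.
# Usage: cap_cpu_freq <kHz> [cpu_root]
# cpu_root is a hypothetical override, defaulting to /sys/devices/system/cpu.
cap_cpu_freq() {
    khz=$1
    root=${2:-/sys/devices/system/cpu}
    written=0
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_max_freq; do
        [ -e "$f" ] || continue          # glob did not match: no cpufreq here
        echo "$khz" > "$f" || return 1   # writing the real files requires root
        written=$((written + 1))
    done
    [ "$written" -gt 0 ]                 # fail if nothing was written
}

# Example (as root): cap_cpu_freq 1400000
```

The value should lie between the cpuinfo_min_freq and cpuinfo_max_freq files in the same directory, and the setting is lost on reboot.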

> 
>   Yu.

