Re: hardware monitoring at the most basic level …
On Wed, 22 Aug 2012 15:18:31 -0400, Albretch Mueller wrote:
> How do you get periodic snapshots of your running hardware?
What? Is your hardware changing on a daily basis? :-?
> My box started to shutdown by itself and I doubt it is related to
> overheating (in a random and plain physical way) so I changed it for
> another one because I didn’t have time for troubleshooting/fixing at
> this moment but then the same thing started to happen to the other box
Ah. I see :-)
It is very odd for a system to shut down on its own on two different
computers. Do both boxes share or use the same piece of hardware?
Anyway, a shutdown indicates a critical condition, whether real or
falsely detected: something is instructing your system to go down. At
first glance, on the hardware side, I would point to the CPU temperature
or a bad power supply. On the software side, a wrong sensor reading or
incorrectly set trip points can also make the system think it is hotter
than it really is and so trigger a shutdown.
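To compare the reported temperatures against the trip points, something
like the following could be used. It is only a minimal sketch: it assumes
a Linux kernel that exposes the thermal zones under /sys/class/thermal
(values in millidegrees Celsius), and the function name
read_thermal_zones is made up for illustration. Not every machine
exposes these files.

```python
from pathlib import Path


def read_thermal_zones(base="/sys/class/thermal"):
    """Return one dict per thermal zone with its temperature and trip points.

    Assumes the Linux sysfs thermal layout; temperatures are converted
    from millidegrees Celsius to degrees Celsius.
    """
    zones = []
    for zone in sorted(Path(base).glob("thermal_zone*")):
        try:
            temp_c = int((zone / "temp").read_text().strip()) / 1000.0
        except (OSError, ValueError):
            continue  # zone not readable or not reporting a number
        trips = []
        for trip_file in sorted(zone.glob("trip_point_*_temp")):
            # File names look like trip_point_0_temp, so field 2 is the index.
            index = trip_file.name.split("_")[2]
            try:
                trip_c = int(trip_file.read_text().strip()) / 1000.0
                type_file = zone / f"trip_point_{index}_type"
                trip_type = type_file.read_text().strip()
            except (OSError, ValueError):
                continue
            trips.append((trip_type, trip_c))
        zones.append({"zone": zone.name, "temp_c": temp_c, "trips": trips})
    return zones
```

Calling read_thermal_zones() and printing the result from time to time
shows whether the current temperature is really approaching a "critical"
trip point or whether the trip point itself looks implausible.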
> What I notice is that for no obvious apparent reason the CPU taxes to
> the max and the box starts revving wildly ~
When that happens, run "top" and sort the processes by CPU usage
(pressing "P", that is Shift+p) to see which one is the culprit.
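If the box goes down before you get a chance to look at "top", the same
information can be captured from a script and appended to a log. The
following is a rough sketch, assuming the procps "ps" found on Debian;
the helper names parse_ps_output and top_cpu_processes are hypothetical.
Note that the pcpu column of "ps" is CPU time averaged over the lifetime
of the process, not the instantaneous load that "top" shows.

```python
import subprocess


def parse_ps_output(text, limit=5):
    """Parse 'ps -eo pid,pcpu,comm' output into (pid, pcpu, command) tuples."""
    rows = []
    for line in text.strip().splitlines()[1:]:  # skip the header line
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        pid, pcpu, comm = parts
        try:
            rows.append((int(pid), float(pcpu), comm.strip()))
        except ValueError:
            continue  # ignore lines that do not parse cleanly
    # Highest CPU percentage first.
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]


def top_cpu_processes(limit=5):
    """Run ps and return the processes using the most CPU."""
    result = subprocess.run(
        ["ps", "-eo", "pid,pcpu,comm", "--sort=-pcpu"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_ps_output(result.stdout, limit)
```

Running top_cpu_processes() every minute or so (from cron, for example)
and writing the result to a file with a timestamp leaves a trail to read
after the next unexpected shutdown.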
> I use different live CDs based on linux debian and I am very careful
> in order to avoid the regular bs out there ~
> I would like to periodically test the underlying hardware as low as
> possible to the bare metal, because if something is messing with your OS
> it will be harder for you to notice anything ~
> Any best practices and tips you would share?
You can run a specialized LiveCD for this kind of test (Inquisitor):
http://www.inquisitor.ru/about/index.html
Greetings,
--
Camaleón