
Re: continuous reboots in a two nodes cluster with heartbeat and pacemaker.



On 8/13/2012 3:37 AM, Mauro wrote:
>>
>> Are these controlled shutdowns?  Or are these hardware crash/reboots
>> that are occurring?
>>
>> If the former, you should see syslog entries for the shutdown sequence.
>> If the latter, you won't see anything in the logs.  This would suggest
>> you've got a hardware problem, and not one related to faulty NICs or
>> switches.
>>
>> What kind of UPS are these machines powered from?  Have you checked the
>> UPSes and verified they are functioning properly?  If you have a power
>> event and the UPS drops the load, the machines will reboot without a hint
>> in the logs as to what caused the reboot.
>>
>> Finally, what servers are these?  Dell/HP/IBM or whitebox?  Memory
>> mismatch or simply bad memory can cause inexplicable reboots.  If the
>> machines are decent quality, their BIOS should log such events.
> 
> Servers are HP ProLiant DL580 G5.
> I'm afraid that I have hardware problems :-(

I don't think you have enough solid information yet to make that
assumption, unless you've discovered something you didn't share with us.

> The strange thing is that it happens alternately on both nodes.

That being the case, I'd suspect something other than server hardware.
To be sure, manually remove one node from the cluster and see how long
the remaining node runs without rebooting.  If it doesn't reboot at all,
that largely rules out that node's hardware as the fault point.

-- 
Stan


