At 09:36 AM 3/17/2004 -0600, Bogen, Patrick wrote:
We have a cluster here at my work, and whenever the power goes out it turns itself back on, but then it has to be rebooted manually, because the compute nodes come up before (or at the same time as) the head node. Is there some software mechanism that can be used to ensure this doesn't happen?
We used to enforce boot delays with hacks like having the compute nodes time out on spurious NFS mounts, or other init trickery. You could also fsck on every reboot. That would do it. There is also a wait state you can insert into your init scripts somewhere... let me see if I can man that up...
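
In the meantime, here is a rough sketch of the "wait in an init script" idea, not the init facility I was trying to remember. It blocks until a TCP connection to the head node succeeds, so a compute node could run it before mounting anything from the head node. The function name, the host name "head", and the choice of port 2049 (the usual NFS port) are my assumptions for illustration; adjust them for your site.

```python
import socket
import time


def wait_for_head_node(host, port, interval=5.0, timeout=None,
                       connect=socket.create_connection, sleep=time.sleep):
    """Poll host:port until a TCP connection succeeds.

    Hypothetical helper: returns True once the head node answers, or
    False if `timeout` seconds elapse first (timeout=None waits forever).
    `connect` and `sleep` are parameters only so they can be swapped out
    when testing.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        try:
            # A successful connect means the service on the head node is up.
            with connect((host, port), timeout=interval):
                return True
        except OSError:
            # Refused, unreachable, timed out, or name not resolvable yet.
            pass
        if deadline is not None and time.monotonic() >= deadline:
            return False
        sleep(interval)
```

Something like wait_for_head_node("head", 2049) called from a late boot script on each compute node would hold the node there until the head node's NFS server is listening. Polling a service port rather than pinging means the node waits for the service it actually needs, not just for the head node's network stack.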