
Re: End of hypocrisy ?



On Mon, 4 Aug 2014 15:34:22 -0400
Tom H <tomh0665@gmail.com> wrote:

> On Mon, Aug 4, 2014 at 10:37 AM, Andrew McGlashan
> <andrew.mcglashan@affinityvision.com.au> wrote:
> > On 4/08/2014 11:32 PM, Tom H wrote:

> > Sure it counts, but if you have 1000s of servers, you likely have
> > many other considerations and you'll be pooling [at least] those
> > servers in a cluster type arrangement ... much lessening the need
> > for any machine to start up so quickly.
> 
> It's a nice theory. I'll give you an example (not fully technical but
> an example nonetheless; and I could give you others).
> 
> Suppose that you have a 16-node cluster, some patches were applied to
> the systems overnight, a mistake was made, and you have to correct
> this mistake on all of the systems during trading hours. Once you get
> all the OKs that are needed for this kind of emergency change, the
> head of the trading desk that uses that cluster calls you and says
> "I'm going to be on the line for as long as you're working on our
> system." So you fix one node, reboot it, make sure that it's back in
> the cluster and doing its job, and fix another, etc. You can be sure
> that everyone's happier that the systems boot quickly and that the
> cluster was running with 15 rather than 16 nodes for as few minutes as
> possible (because you can be sure that the fact that this cluster
> wasn't running at full capacity for X minutes will come up in
> managerial meetings, both in IT ones and in IT-Business ones).

If I understand correctly, these nodes are servers. Tell me one more
time, just so I understand, why do these boxes have so many daemons
that their boots take minutes? If some of the daemons fill a diagnostic
role only, why not start them up after complete bootup? Perhaps with
Daemontools. And make sure your reverse DNS spins up early and well, so
that it doesn't cause delays on everything else.

Concurrency is wonderful when it works, hell on wheels when it doesn't.

SteveT

Steve Litt                *  http://www.troubleshooters.com/
Troubleshooting Training  *  Human Performance

