
Re: was Four people troll - now meandering off elsewhere



Scott Ferguson wrote:
On 04/03/14 08:41, Miles Fidelman wrote:
Scott Ferguson wrote:
On 03/03/14 23:28, Fred Wilson wrote:
On Mon, 03 Mar 2014 12:52:40 +1100 Scott Ferguson
<scott.ferguson.debian.user@gmail.com> wrote:

Which is fine for you, and I can understand and appreciate
that; for my own personal computers my sentiments are similar.
However, my business purposes involve meeting SLAs, so reboots
once or twice a year can cost a lot of money - in those
circumstances a few minutes makes a lot of difference. Perhaps
that's not something you care about - or it's just convenient
to ignore until your bank/phone/stockbroker/shopping is
interrupted as a result.
http://en.wikipedia.org/wiki/Service-level_agreement
http://en.wikipedia.org/wiki/High_availability#Percentage_calculation





When you pay for a five nines SLA, perhaps for your business web
site hosting, or what your bank/business pays for their trading
platform, that means we must be offline for a *total* of less than 5
and a half minutes *a year*. That's from the start of the reboot until
all services are restarted. Failure to do so results in penalties that can *very*
quickly exceed the annual support contract. While a great deal of
effort and planning goes into *shifting loads* so that reboots
don't affect production - things don't always work to plan, so good
plans allow for that. Meaning systems must be designed to reboot in
less than the allowed downtime - with a safety margin. If we can
shave a few seconds off reboot time we can shave a large amount off
the support contract price, with the possibility that those savings
are passed on to the consumer.
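
For anyone wanting to check the arithmetic, here is a rough sketch of the downtime budget behind a "nines" figure. The function names, the 365.25-day year, and the 90-second reboot in the usage note are illustrative assumptions, not figures from any actual contract.

```python
# Assumption: an average year of 365.25 days (leap days included).
MINUTES_PER_YEAR = 365.25 * 24 * 60


def allowed_downtime_minutes(nines: int) -> float:
    """Yearly downtime budget in minutes for an availability of N nines.

    Five nines means 99.999% availability, i.e. an allowed
    unavailability of 10**-5 of the year.
    """
    unavailability = 10 ** -nines
    return MINUTES_PER_YEAR * unavailability


def remaining_budget_minutes(nines: int, reboot_seconds: float,
                             reboots_per_year: int) -> float:
    """Budget left after planned reboots (hypothetical helper).

    A negative result means the reboots alone would breach the SLA.
    """
    spent = reboot_seconds * reboots_per_year / 60
    return allowed_downtime_minutes(nines) - spent


if __name__ == "__main__":
    print(f"five nines budget: {allowed_downtime_minutes(5):.2f} min/year")
    print(f"left after 2 x 90 s reboots: "
          f"{remaining_budget_minutes(5, 90, 2):.2f} min")
```

With those assumed numbers, two 90-second reboots use up more than half of the yearly five nines budget, which is why a few seconds per reboot matter.
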
Anybody who is counting on a fast reboot to maintain a 5 nines SLA
is simply nuts.

Agreed.
What do you call people who don't read what they reply to? The
same thing you call people who "know" about areas of technology they
have had no experience in?

That's what redundancy and high-availability configurations are for.
Yes. And they get tested, as do all the components. Which means
rebooting is something that doesn't just happen on production systems.
It all adds up to lost productivity. To paraphrase Oliphant:
"extrapolation is not a human strength".

Personally, I'm a lot more worried about what's going to break when
we move to Jessie and systemd - and all those things I might have to
  reconfigure.  That involves serious time, effort, and dollars.  And
  that's before the things that will break intermittently.  I still
shudder every time I think of the impact udev had on our operations,
  before we got the subtleties figured out. (Note: at the moment "we" =
  "me" and sleepless nights that impact other work.)

Anybody who is counting on stability and not running stable is, I won't
say nuts, but I would say "challenged", and sure to have an
"interesting" time. :)  That said your use cases are unlikely to be mine
- and I don't know what I don't know, so I won't presume to dictate your
needs.

We don't move to stable until it's been stable at least a year (so the
move to Wheezy has only been recent, in many cases we still run
old-stable) - anything less gives insufficient time for testing. But the
developers need at least two years lead time before we can even sit down
and discuss support contracts that entail more substance than trying to
nail snot to the wall.



Well... just to be clear - we still run old_stable (and earlier) on a lot of stuff. "If it ain't broke, don't fix it" remains good practice. I generally migrate stuff when security patches stop being available.

Miles

--
In theory, there is no difference between theory and practice.
In practice, there is.   .... Yogi Berra

