Re: Dreamhost dumps Debian



Clint Byrum <spamaps@debian.org> writes:

> Dreamhost is a hosting company. It actually is quite possible that all
> 20,000 machines mentioned are unique snowflakes in this case. Though it
> is probably more likely that there are at most 10,000 unique machines, with
> some customers having only one, but others having 3 or more.

I would suspect that the exposed interface to the client of systems for
which Dreamhost is maintaining the OS is something more like "Apache and
PHP 5.x."  Which means the machines aren't really unique snowflakes from
Dreamhost's perspective.  They may each have unique client data installed
on them, but that's not managed by Dreamhost.

Yes, it's definitely a huge hassle to communicate to all of those
customers to coordinate an upgrade from "Apache 2.0 and PHP 5.x" to
"Apache 2.2 and PHP 5.x+1".  But I'm dubious that it's really 20,000
unique hassles.  It's a hassle around a changed version of PHP with
clients, a hassle around a new version of Perl with different clients,
upgrades of a pile of backend database systems to a newer version of
MySQL, and so forth.  Each of those is real work, of course, but they
don't multiply by number of machines, or at least not obviously.

> How long does FAI take to make a new machine? If it is more than 30
> minutes then you need at least two FAI's going all the time to finish on
> time.

It depends on how many packages you want FAI to install, but about five
minutes.  But with that many machines, you'd obviously parallelize, not do
one at a time.  Most of the work happens on the system being bootstrapped.
You do want a fast local Debian mirror.
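As a rough check on the arithmetic in this exchange, here is a minimal
Python sketch. It assumes 20,000 machines, a one-year window, and installs
running around the clock with no failures or retries; the function names
are made up for illustration and are not part of FAI.

```python
import math

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def parallel_installs_needed(machines, minutes_per_install,
                             deadline_minutes=MINUTES_PER_YEAR):
    """Minimum number of installs that must run concurrently, nonstop,
    to rebuild every machine before the deadline (idealized)."""
    total_minutes = machines * minutes_per_install
    return math.ceil(total_minutes / deadline_minutes)


def serial_days(machines, minutes_per_install):
    """Days needed to rebuild every machine strictly one at a time."""
    return machines * minutes_per_install / (24 * 60)


if __name__ == "__main__":
    # Hypothetical install times: the 30 minutes from the quoted question
    # and the roughly five minutes estimated above.
    for minutes in (30, 5):
        print(minutes, "min/install:",
              parallel_installs_needed(20_000, minutes), "concurrent,",
              round(serial_days(20_000, minutes), 1), "days if serial")
```

At 30 minutes per install this gives two concurrent installs, which matches
the quoted estimate. At five minutes a single serial stream would finish in
about 69 days, so the install time itself is unlikely to be the bottleneck.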

> I wasn't clear, I don't mean you'll do each one as a special snowflake
> in-place.  I mean, 20,000 machines is simply a lot of machines to
> manage. No matter what, upgrading or replacing the OS all within a 1
> year schedule that you do not control and cannot fully predict, is a big
> hassle.

Oh, sure.  But 20,000 machines is a lot of machines to manage for
*anything* that you do, and *everything* you have to do across 20,000
machines is a big hassle.  I don't think OS upgrades are a unique issue.
That's why, when you have 20,000 machines, you staff up accordingly.  Even
assuming a system to sysadmin ratio of 1000:1 (which would be excellent
and which would clearly imply a huge amount of homogeneity and automation),
that's an operational group of 20 full-time people.

The industry average ratio of systems to sysadmin is about 30:1 for
physical systems and 80:1 for virtual systems, with common variation from
10:1 to 500:1.  Dreamhost, as a hosting company rather than a more typical
IT organization running things like financials, is obviously going to be
pushing or exceeding the upper end of that, but I think 1000:1 is still
fair as a guess.
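
To make the staffing arithmetic concrete, here is a small Python sketch
that turns the ratios quoted above into headcounts for 20,000 machines.
The 20,000 figure and the ratios come from this thread; the function name
is hypothetical, and rounding up to whole people is an assumption.

```python
import math


def sysadmins_needed(systems, systems_per_admin):
    """Headcount implied by a systems-per-sysadmin ratio, rounded up."""
    return math.ceil(systems / systems_per_admin)


if __name__ == "__main__":
    # Ratios as given in the text: industry averages and the 1000:1 guess.
    ratios = {
        "physical average (30:1)": 30,
        "virtual average (80:1)": 80,
        "guess for Dreamhost (1000:1)": 1000,
    }
    for label, ratio in ratios.items():
        print(f"{label}: {sysadmins_needed(20_000, ratio)} sysadmins")
```

Under these assumptions the averages imply 667 and 250 people, against 20
at 1000:1, which shows how much homogeneity and automation the guess
already presumes.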

(Google is believed to be around 10,000:1, but that's after huge internal
investments in specialized automation and huge efforts on absolute
standardization and scaling, including tons of custom OS work and
invention of their own file systems, things that I doubt Dreamhost has
done.)

-- 
Russ Allbery (rra@debian.org)               <http://www.eyrie.org/~eagle/>
