
Re: Debian on Supercomputer



Hi all,

Andre Felipe Machado wrote:
> Hello, Dr. Carsten

Please, no Dr in these (private) communications - Dr is good for the
press (and other official duties) but not for talking among ourselves.

> Great news!
> Please, send more details [2] to the debian-publicity list enabling us
> to write about it.
> Do not worry about "style" or british english correctness. Sending the
> information is enough. There are PR writers and proof-readers at the
> debian-publicity team.
> From the guidelines [2] "structure" section we will need:
> 
>         "
>         1- Who, where, when is using and chose Debian? 

OK, let's start. We are a group of scientists working at the Max Planck
Institute for Gravitational Physics in Hannover[0a], Germany, trying to
detect gravitational waves directly. We do this together with our
friends and colleagues within the LIGO Scientific Collaboration. For
more background I would suggest links [1], [2] and [3] for starters.

Most of the "computer" guys have been using Debian for years and thus it
was a natural choice for us to utilize Debian on a cluster.

>         
>         2- What was accomplished / built / offered? 
The ATLAS cluster consists of 1342 compute nodes (Intel Xeon 3220
quad-core 2.4 GHz, 8 GB RAM, 500 GB Hitachi HDD, IPMI remote
management) along with 31 data servers (2x Intel Xeon E5345 2.33 GHz,
16 GB RAM, Areca 1261ML, 16x 750 GB Hitachi HDD) plus 4 similar head
nodes with "only" 4x 750 GB HDD. These all run Debian Etch with a few
modifications from our side (e.g. a custom kernel, the Condor queuing
system, ...).

For further storage we are using 13 Sun Fire X4500 systems running
Solaris 10.

The system was NOT built by one of the usual "Fortune 500" suspects,
but by a mid-sized company here in Germany called Pyramid Computer
GmbH[4].

One of our many specialties is the network from Woven Systems[5],
which is a hierarchical, fully non-blocking network. The EFX 1000 core
switch features 144 10 Gb/s CX4 ports and currently connects to 32
TRX100 edge switches, each of which features 48 1 Gb/s ports and 4x
10 Gb/s uplinks. Our X4500 systems are also directly connected to the
core switch.

Some benchmark numbers: the theoretical peak is beyond 50 TFlop/s; the
measured speed in terms of the top500.org Linpack [6] is 32.8 TFlop/s.
This would place us in spot #34 worldwide, #11 in Europe and #4 in
Germany on the current list from November 2007.

Another special point: The system was relatively cheap, i.e. less than
EUR 1.8m (~ $2.8m).
>         
>         3- Why Debian was chosen? 
>         
The people designing, building and running the cluster (namely Dr
Henning Fehrmann and Dr Carsten Aulbert) have been using Debian for
years. Our brother and sister systems in Potsdam[0b], "Merlin" and
"Morgane" [7], have been running Debian for years (one was converted
from RH 7.x at some point) and the experience has been very, very good.

Debian features an extremely large set of packages, making it *the*
distro of choice for us and saving us the hassle of packaging needed
software ourselves.

Also, Thomas Lange's FAI package[8] is extremely useful for automatic
deployment of Debian: for example, without much tweaking and using only
two install servers, we were able to reinstall the whole cluster in
about 2.5 hours, limited only by those two servers' network connections.
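For readers who haven't used FAI: a minimal sketch of what flagging a
node for reinstallation looks like on the install server (the host name
n0001 is just an example; this assumes FAI has already been set up with
fai-setup and a populated config space):

```shell
# On the FAI install server, mark node n0001 so that its next PXE boot
# starts an FAI installation (-I = install, -F = use the default FAI
# flags such as verbose and sshd, -v = verbose output).
fai-chboot -IFv n0001

# Once the installation has finished, switch the node's PXE entry back
# to booting from its local disk.
fai-chboot -o n0001
```

Reinstalling the whole cluster is then essentially a loop over all node
names plus a reboot; the install servers' bandwidth becomes the limit.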

(Note: more than two weeks ago I would have written something about the
very good security support; given that the reaction to the OpenSSL
issue was very good, I still could. But in reality we don't need
security updates except on the head nodes; everything else is only
exposed internally.)

>         4- What are the benefits of using Debian? 

Partly covered above; maybe one should add
* the simplicity of creating our own packages
* how easily repositories can be set up (we use reprepro)
* using clean build environments (pbuilder et al.)
* and, of course, the superb packaging infrastructure in general (aka
dpkg/apt/aptitude)
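To illustrate that workflow, a hedged sketch (package name, version and
repository path are made up; "etch" is the distribution we actually
run) of building a package in a clean chroot with pbuilder and
publishing it with reprepro:

```shell
# Create a pristine etch build environment once (needs root).
sudo pbuilder create --distribution etch

# Build a source package inside that clean chroot; the resulting .deb
# lands in /var/cache/pbuilder/result by default.
sudo pbuilder build mypackage_1.0-1.dsc

# Publish the binary into a reprepro-managed repository; this assumes
# /srv/debian/conf/distributions already describes an "etch" suite.
reprepro -b /srv/debian includedeb etch \
    /var/cache/pbuilder/result/mypackage_1.0-1_amd64.deb
```

The nodes then simply get the repository as an extra apt source, so a
locally patched package reaches the whole cluster with a normal
apt-get upgrade.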
>         
>         5- How Debian enabled such success / feature / business model,
>         etc? 

Hard to say anything here. Our colleagues are mainly using CentOS 5 or
older Fedora versions, so the cluster would also run with those,
though possibly with more work for us.

Personally, I like community distros more, since they offer more
long-term stability than a distro which is driven by the need to
release often in order to generate revenue. On the downside, though, it
would be better for us to have a more settled release plan and/or some
kind of "stable and supported" backports.

> Please, write in english.

Sure :)

> As soon as we have a draft at new draft area [3], we will contact you
> asking for a review.
Please keep Elke Müller on the cc list for that, since she handles most
of our PR stuff here.

I hope this already gives you some information; if you need more, e.g.
bios, more details about the science, ... please let me know.

Cheers

Carsten

[0a] http://en.wikipedia.org/wiki/Hannover
[0b] http://en.wikipedia.org/wiki/Potsdam
[1] http://www.ligo.org/
[2] http://www.aei.mpg.de/hannover-en/66-contemporaryIssues/home/index.html
[3] http://www.einstein-online.info/en/
[4] http://www.pyramid.de/
[5] http://www.wovensystems.com/
[6] http://www.top500.org/
[7] http://gw.aei.mpg.de/resources/computational-resources/merlin-morgane-dual-compute-cluster
[8] http://www.informatik.uni-koeln.de/fai/

