
Re: [DRAFT FOR REVIEW] Debian GNU/Linux 32.8 TFlops supercomputer at Max Planck Institute for Gravitational Physics

On Saturday 31 May 2008 01:59:56 Andre Felipe Machado wrote:
> The new success story draft version [0] is ready for review.

Well done: the new draft is much improved, in my opinion.

> There is a new title proposal.

My view is still that a short title is better.  So, I would vote for 
either "Debian GNU/Linux powers Max Planck Institute supercomputer" 
or "Debian GNU/Linux powers Max Planck Institute 32.8 TFlops supercomputer".

> You may add new ones and call for discussion at debian-publicity list.
> Also, the text was reorganized, paragraphs moved and sections created
> for clear view.
> These modifications must be evaluated.

I think it is still too long.  My view is that the main text (without the 
About ATLAS and About Debian sections) should be one page and a single 

Given that this is a "success story" I am not sure what value the "Debian is 
continuously evolving" section adds.  I would prefer to delete it. 

If anything more is needed it should be a summary -- what conclusions should 
people draw from this success story.  Something about the wide range of 
scalability for Debian, from tiny embedded systems to massive supercomputers, 
and the usefulness of community contributed tools like FAI.  But I think that 
can actually be covered quite well in the rest of the document.

In order to better explain what I mean, I have included, below, a (text-only) 
version of my suggested re-arrangement and edits.  I have not made these 
changes in the Wiki because I think they are too large to make without prior 
discussion here.  [By the way, I *have* made some small changes in the Wiki 
where I think it improves the readability of the existing text.]

Debian GNU/Linux powers Max Planck Institute supercomputer

A team of scientists at the Max Planck Institute for Gravitational Physics  
have created Germany's 4th largest supercomputer by using Debian GNU/Linux.

The Observational Relativity and Cosmology Research Group is a team of 
scientists working at the Hannover Branch of the Max Planck Institute for 
Gravitational Physics (Albert Einstein Institute) in Hannover, Germany. Their 
goal is the direct detection of gravitational waves, which were first 
predicted by Albert Einstein. They work with friends and colleagues within 
the LIGO Scientific Collaboration and VIRGO.

The massive computing effort necessary for this research is provided by a 
Debian GNU / Linux cluster of 1342 nodes called ATLAS. Using 10+ TB RAM, 
approximately 1.3 PB storage and a special network able to transfer almost 4 
days' worth of DVD movies each second, the cluster achieves a measured 
performance of 32.8 TFlops. This performance places the ATLAS Debian GNU / 
Linux supercomputer at 4th place in Germany, 11th in Europe and 34th 
worldwide, at a cost of EUR 1.8m (~ US$ 2.8m).

The ATLAS Debian GNU / Linux cluster was designed and built, and has been 
managed, by Dr. Henning Fehrmann and Dr. Carsten Aulbert, who have been using 
Debian GNU / Linux for years.  ATLAS has smaller brother and sister systems in 
Potsdam, Germany: "Merlin" (1.3 TFlops) and "Morgane" (6 TFlops) -- also 
running Debian GNU / Linux and managed by Dr. Steffen Grunewald for many 
years; "the experience with them had been very, very good", according to Dr. 

"Thomas Lange's FAI package is extremely useful for automatic deployment of 
Debian [GNU / Linux]. For example, without much tweaking and using only two 
hosts, we were able to reinstall the cluster in about 2.5 hours and were only 
limited by those two servers' network connection", said Dr. Aulbert.  Dr. 
Grunewald added, "FAI with its class model was a major breakthrough, in 
readability, functionality, and maintainability. There's no way back now."

Beyond FAI, there are other useful tools for massive scale installation, 
deployment and management of Debian GNU / Linux machines for various 
scenarios.  "Debian features an extremely large set of packages, making it 
THE distro of choice for keeping us out of the hassle to package needed 
software ourselves", explained Dr. Aulbert.

As additional benefits of using Debian GNU / Linux, he cited:

    *      the simplicity of creating one's own packages
    *      how easily repositories can be set up (using the reprepro package)
    *      the use of clean build environments (pbuilder and similar packages)
    *      and, of course, the superb packaging infrastructure in general
           (dpkg, apt, aptitude, synaptic and many useful APT tools)

By using Debian GNU / Linux on its clusters, the Observational Relativity and 
Cosmology Research Group reduced the amount of work needed on the hardware 
and software infrastructure, compared to other scientific clusters running on 
other distributions, allowing them to focus on their objective of detecting 
gravitational waves.

About the ATLAS cluster

The ATLAS cluster, with a Linpack-measured performance of 32.8 TFlops and a 
theoretical peak of about 50 TFlops, consists of 1342 Supermicro compute nodes 
(Intel Xeon 3220 quad-core, 2.4 GHz, 8 GB RAM, 500 GB Hitachi HDD, IPMI remote 
management) along with 31 data servers (2x Intel Xeon E5345 2.33 GHz, 16 GB 
RAM, Areca 1261ML, 16 x 750 GB Hitachi HDD) plus 4 similar head nodes with 
4 x 750 GB HDD. Those are all running Debian GNU / Linux 4.0 Etch with a few 
modifications, such as a custom kernel and the Condor queuing system. 
Additional storage space is 
supplied by 13 Sun Fire X4500 running Solaris 10. The system was built from 
off-the-shelf computers from a German company, Pyramid Computer GmbH.
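
As a rough sanity check, the "about 50 TFlops" theoretical peak can be 
reproduced from the node count and clock speed given above. The sketch below 
assumes 4 double-precision floating-point operations per core per clock 
cycle; that figure is not stated in the text and is only an assumption, as 
are all the variable and function names.

```python
# Rough check of the theoretical peak quoted for ATLAS.
NODES = 1342             # compute nodes (from the text)
CORES_PER_NODE = 4       # quad-core CPUs (from the text)
CLOCK_HZ = 2.4e9         # 2.4 GHz (from the text)
FLOPS_PER_CYCLE = 4      # ASSUMPTION: not stated in the text
LINPACK_TFLOPS = 32.8    # measured performance (from the text)


def peak_tflops(nodes, cores_per_node, clock_hz, flops_per_cycle):
    """Theoretical peak in TFlops: every core busy on every cycle."""
    return nodes * cores_per_node * clock_hz * flops_per_cycle / 1e12


total_cores = NODES * CORES_PER_NODE
peak = peak_tflops(NODES, CORES_PER_NODE, CLOCK_HZ, FLOPS_PER_CYCLE)
efficiency = LINPACK_TFLOPS / peak

print(f"total cores:        {total_cores}")
print(f"theoretical peak:   {peak:.1f} TFlops")
print(f"Linpack efficiency: {efficiency:.0%}")
```

With that assumption the peak comes out at roughly 51.5 TFlops, so the 
measured 32.8 TFlops would correspond to a Linpack efficiency of about 64%.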

One of the cluster's special hardware components is its network from Woven 
Systems, a hierarchical, fully non-blocking design. The EFX 1000 core switch 
features 144 CX4 ports at 10 Gb/s each and currently connects to 32 TRX100 
edge switches, each with 48 ports at 1 Gb/s and 4 uplinks at 10 Gb/s, 
reaching 2880 Gb/s. The Sun Fire X4500 servers are also connected directly to 
the core switch.
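
The text does not say how the 2880 Gb/s figure was derived. One possible 
reading is that it counts all 144 core ports at 10 Gb/s in both directions. 
The sketch below works through that reading and a simple port budget; the 
two-direction count and the one-port-per-X4500 figure are assumptions, not 
facts from the text, and the variable names are illustrative only.

```python
# Port and bandwidth budget for the core switch, using the quoted figures.
CORE_PORTS = 144         # 10 Gb/s CX4 ports on the core switch (from the text)
CORE_PORT_GBPS = 10
EDGE_SWITCHES = 32       # edge switches currently attached (from the text)
UPLINKS_PER_EDGE = 4     # 10 Gb/s uplinks per edge switch (from the text)
X4500_SERVERS = 13       # Sun Fire X4500 storage servers (from the text)

DIRECTIONS = 2           # ASSUMPTION: aggregate counts send and receive
PORTS_PER_X4500 = 1      # ASSUMPTION: one core port per X4500

one_way_gbps = CORE_PORTS * CORE_PORT_GBPS
aggregate_gbps = one_way_gbps * DIRECTIONS
uplink_ports = EDGE_SWITCHES * UPLINKS_PER_EDGE
spare_ports = CORE_PORTS - uplink_ports - X4500_SERVERS * PORTS_PER_X4500

print(f"one-way core capacity:   {one_way_gbps} Gb/s")
print(f"aggregate (both ways):   {aggregate_gbps} Gb/s")
print(f"core ports for uplinks:  {uplink_ports}")
print(f"core ports left over:    {spare_ports}")
```

Under these assumptions the 32 edge switches occupy 128 of the 144 core 
ports, which leaves room for the directly attached X4500 servers.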

According to Dr. Grunewald, the Merlin Debian GNU / Linux Beowulf 180-node 
cluster (launched in 2002) initially ran on an RPM-based distribution, but in 
2004 migrated to Debian GNU / Linux after the RPM distro vendor changed its 
licensing model. The total computing power of the 360 CPU cores has been 
estimated to be more than 1.3 TFlops peak; the data storage capacity is about 
20 TB mirrored.

The Morgane Debian GNU / Linux Beowulf cluster, consisting of 615 compute 
nodes, 15 storage nodes, and some head nodes, was launched in December 2006. 
The total computing power of the 1230 CPU cores has been estimated to be more 
than 6 TFlops peak; the data storage capacity is about 100 TB.

About the Debian Project

Debian GNU / Linux is one of the free libre operating systems (GNU/Linux, 
GNU/Hurd, GNU/NetBSD, GNU/kFreeBSD), running 18733+ officially maintained 
packages on 15 hardware platforms, from cell phones and network devices to 
mainframes and supercomputers, developed by more than two thousand volunteers 
from all over the world who collaborate via the internet on the Debian 

Debian's dedication to Free Libre Open Source Software, its constitutional 
non-profit nature, its open and meritocratic development model, organization 
and social governance make it a first among free libre operating system 

The Debian project's key strengths are its volunteer base, its dedication to 
the Debian Social Contract and the Debian Constitution, and its commitment to 
provide the best operating systems attainable, following a strict quality 
policy, working with an established QA Team and helpful users reporting bugs, 
suggestions, exchanging ideas, and registering experiences.

You can help the Debian Project without joining it, and even without being a 
programmer: by becoming a development or service partner company or 
institution in the Debian Partner Program, or simply by making donations to 
the Debian Project.

Debian Project news, press releases and press coverage can be found on the 
official Debian wiki page. PR contact: the debian-publicity list.
