
Re: ATLAS debian cluster and Debian 5.0 Lenny?



Hi Andre et al.,

Carsten Aulbert wrote:
>> Please, if your team executes an upgrade at the Atlas cluster to Debian 5.0, tell the debian-publicity list about the experience.
> 
> Sure, we plan the upgrade within the next few months, we will keep you
> informed :)

Over the past week we upgraded our "Atlas" cluster to Lenny. Steffen
(Grunewald, Dr., Cc-ed on this email) also upgraded our sister cluster
"Morgane"[1] from Etch to Lenny (on more than 600 computers).

Overall I think everything went pretty smoothly, except that we used only a
single internal repository server and one FAI[2] server and were thus
limited to a single GBit ethernet link for our upgrade. On "Atlas" we
redid the compute nodes' partitions and now install a lot more software
than before, so a typical install took much longer than before (on the
order of 15 minutes per node). But as we can install many nodes in
parallel, it was only a job of some hours spread over two days to get
everything up and running again.
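
As a back-of-the-envelope sketch of that arithmetic: the 15 minutes per
node is the figure quoted above, but the node count (taken here as the
roughly 1650 machines mentioned in this mail, so an upper bound) and the
6 hour wall-clock budget are assumptions for illustration only, not
measurements from the upgrade. The function name is made up as well.

```python
import math

def parallel_installs_needed(nodes, minutes_per_node, budget_hours):
    """Minimum number of simultaneous installs needed to finish in time."""
    # How many back-to-back install "waves" fit into the time budget.
    waves = math.floor(budget_hours * 60 / minutes_per_node)
    if waves < 1:
        raise ValueError("budget is shorter than a single install")
    # Each wave must cover its share of the nodes, rounded up.
    return math.ceil(nodes / waves)

# Assumed inputs: 1650 nodes (upper bound), 15 min/node, 6 hours in total.
print(parallel_installs_needed(1650, 15, 6))
```

This ignores the single GBit link, which in practice caps how many
installs can usefully run side by side.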

So, what else do you want to know? In total we have about 1650 computers
running right now; a couple of nodes are experiencing hardware
problems, which we are working on at the moment. Hardware-wise we have
not progressed much further, except that we have started to look into
GPU-based systems as well...

Obviously, we like that many packages were upgraded and many more are
available now than back in Etch, but on the other hand we are already
starting to backport some packages from Squeeze as not all wanted/needed
packages made it into Lenny. There have been a few issues with FAI on
machines with multiple NICs in different networks, but that's a detail
we will sort out eventually together with Thomas Lange.

Just this Tuesday (2009-07-07) our latest scientific run started (LIGO
S6 and Virgo VSR2 [3],[4]). We are currently getting data from the US
and Italy and will hopefully make a direct detection of gravitational
waves with this run, which will last late into 2010. During this run we
will copy dozens of TBytes of data onto our file servers for analysis.

I hope this is enough for a start; if not, please feel free
to ask Henning, Steffen or myself.

Cheers

Carsten

[1]http://gw.aei.mpg.de/resources/computational-resources/merlin-morgane-dual-compute-cluster
[2]aptitude install fai-server; http://www.informatik.uni-koeln.de/fai/
[3]http://www.ligo.org/
[4]http://www.virgo.infn.it/

