
RE: [DRAFT FOR REVIEW] Sanger Institute Debian cluster 320 TB swap 1.5PB storage

 I have reviewed the text and made small modifications to make it
more accurate.

I hope this is OK.


Phil Butcher

-----Original Message-----
From: Andre Felipe Machado [mailto:andremachado@techforce.com.br] 
Sent: 04 March 2008 02:09
To: debian-publicity@lists.debian.org
Cc: press.officer@sanger.ac.uk; avc@sanger.ac.uk; Phil Butcher
Subject: [DRAFT FOR REVIEW] Sanger Institute Debian cluster 320 TB swap
1.5PB storage

Please review the attached draft for errors and possible improvements.
The most up-to-date version of the draft is maintained and rendered at [0].
The target publishing date is March 6th, 2008, 12:00 GMT. Please submit
corrections to the debian-publicity list [1] before that deadline.
Andre Felipe Machado

[0] http://times.debian.net/1225
[1] http://lists.debian.org/debian-publicity
(anyone can post to the list, but only subscribers will receive messages)


 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 

<h1>Wellcome Trust Sanger Institute, UK, uses a Debian cluster with a 320 TB 
HP-SFS (Lustre) filesystem as part of its 1.5 PB storage for human genome sequencing</h1>
The <a href="http://www.sanger.ac.uk/">Wellcome Trust Sanger Institute</a>, 
<a href="http://en.wikipedia.org/wiki/Hinxton">Hinxton</a>, 
<a href="http://en.wikipedia.org/wiki/South_Cambridgeshire">South Cambridgeshire</a>,
 UK, runs 
a Debian GNU/Linux cluster of 640+ cores
 with 320 terabytes of "live data", which works like a giant memory swap partition.
Each of the 27 new-technology, computerized robotic genome sequencers generates 1 TB of image data every three days, at a 2 MB/s rate during a 2-hour run.
This amount of data needs to stay "live" during sequencing and initial analysis
 and, given the processing needs of the scientific software on 
<a href="http://www.debian.org/users/org/sangerinstitute.en.html">the Debian GNU/Linux
 cluster of 640+ cores</a>, the "swap-like" storage has to hold 320 TB.
Antony Cox, PhD, the Head of Sequencing Informatics, and Phil Butcher, the Head
 of IT at the institute, gave
 <a href="http://www.guardian.co.uk/technology/2008/feb/28/research.computing">an interview</a>
 to The Guardian, presenting the Thousand Genome Project.
The institute is accurately sequencing one thousand individual human genomes
 to map all differences that occur in 0.5% or more of the sampled population, and
 to identify the locations involved in interactions between multiple gene bases
 that cause different conditions.
Given that human DNA has 3 billion bases, and that each sampled 2640-base 
fragment must be sequenced between 11 and 30 times to factor out measurement
 errors, this is one of the biggest computational efforts under way today.
The project is unique not only because it deals with 1.5 PB of storage, but
 also because it keeps 320 TB of "swap-like storage" for fast comparisons and calculations.
According to Butcher, genomics research is shifting its focus from the laboratory of glass tubes towards informatics. The Sanger Institute started using Debian GNU/Linux when the world discovered how reliable and useful it can be.
Now the institute has to compete with Linux-using commercial organisations for system administrators who can manage large clusters with large-scale distributed filesystems.
You may read 
<a href="http://www.guardian.co.uk/technology/2008/feb/28/research.computing">the interview</a>
 for more details.

<h2>About the Wellcome Trust Sanger Institute</h2>
<a href="http://www.sanger.ac.uk/">The Wellcome Trust Sanger Institute</a>
 is one of the world's largest centres for DNA sequencing and analysis. It made
 the largest single contribution to the sequence of the 
<a href="http://www.sanger.ac.uk/HGP/">Human Genome Project</a>,
 contributed approximately 25% of the 
<a href="http://www.sanger.ac.uk/Projects/M_musculus/">mouse genome sequence</a>,
 and is finishing the 
<a href="http://www.sanger.ac.uk/Projects/D_rerio/">zebrafish genome sequence</a>,
 as well as making contributions to other model organism sequences, such as 
<a href="http://www.sanger.ac.uk/Projects/Fungi/">yeasts</a> 
 and the nematode 
<a href="http://www.sanger.ac.uk/Projects/C_elegans/">C. elegans</a>.
 Institute researchers have also contributed to the sequence of more than 60
 finished genomes of bacterial pathogens, such as Salmonella typhi, TB, MRSA and
 C. difficile, as well as parasites such as those causing malaria, African
 trypanosomiasis and leishmaniasis.
Investment in 
<a href="http://www.sanger.ac.uk/Info/News-releases/2007/071206.shtml">new-technology sequencing</a>
 will dramatically increase the breadth and depth of genome analysis in humans,
 model organisms and pathogens.
You can contact the Wellcome Trust Sanger Institute press team 
<a href="http://www.sanger.ac.uk/Teams/Team97/">here</a>.

<h2>About the Debian Project</h2>

<p>Debian GNU/Linux is 
 <a href="http://www.debian.org/ports/">one</a>
 of the 
 <a href="http://www.debian.org/intro/free";>free libre</a> operating systems
 (GNU/Linux, GNU/Hurd, GNU/NetBSD, GNU/kFreeBSD),
 developed by more than two thousand 
 <a href="http://asdfasdf.debian.net/~tar/bugstats/?8";>volunteers</a> from 
 <a href="http://www.debian.org/devel/developers.loc";>all over the world</a> who 
 <a href="http://www.debian.org/devel/">collaborate</a> via the
 internet on the <a href="http://www.debian.org";>Debian Project</a>.</p>

<p>Debian's dedication to 
 <a href="http://www.debian.org/intro/free";>Free Libre Open Source Software</a>, its 
 <a href="http://www.debian.org/devel/constitution";>constitutional</a> 
 non-profit nature, its 
 <a href="http://vote.debian.org/">open</a> and 
 <a href="http://en.wikipedia.org/wiki/Meritocracy">meritocratic</a> 
 development model, 
 <a href="http://www.debian.org/intro/organization">organization</a> 
 and social 
 <a href="http://www.techforce.com.br/index.php/news/linux_blog/scientific_study_about_debian_governance_and_organization">
 governance</a> make it 
 <a href="http://www.debian.org/doc/manuals/project-history/">unique</a> 
 among free libre operating system distributions.</p>

<p>The Debian Project's key strengths are 
<a href="http://www.debian.org/devel/people">its volunteer base</a>, 
 its dedication to the 
 <a href="http://www.debian.org/social_contract">Debian Social Contract</a>, 
 and its <a href="http://wiki.debian.org/WhyDebianForDevelopers">commitment</a> 
 to provide the best operating systems attainable, following a
 strict quality <a href="http://www.debian.org/doc/debian-policy">policy</a>
 and working with an established
<a href="http://qa.debian.org/">QA Team</a>.
You can 
<a href="http://www.debian.org/intro/help">help</a>
 the Debian Project without 
<a href="http://www.debian.org/devel/join">joining</a>
 it, 
<a href="http://wiki.debian.org/DebianForNonCoderContributors">even without being a programmer</a>,
 by becoming a development or service 
<a href="http://www.debian.org/partners/">partner</a> company or institution in the 
<a href="http://www.debian.org/partners/partners">Debian Partner Program</a>,
 or simply by making 
<a href="http://www.debian.org/donations">donations</a> to the Debian Project.</p>
<p>Debian Project news, press releases and press coverage can be found 
on the official Debian wiki 
<a href="http://wiki.debian.org/News">page</a>. For PR contact, write to the 
<a href="http://lists.debian.org/debian-publicity">debian-publicity list</a>.</p>
