Re: Beowulf in Bioinformatics

To: debian-beowulf@lists.debian.org
Subject: Re: Beowulf in Bioinformatics
From: ROGERIO DE CARVALHO BASTOS <rogeriobastos@dcc.ufba.br>
Date: Wed, 22 Jun 2011 13:09:45 -0300
Message-id: <20110622130945.31581m4ve9e96v2h@webmail.dcc.ufba.br>
In-reply-to: <4E00DD70.70304@gf7.com.br>
References: <4E00DD70.70304@gf7.com.br>

Hi Guilherme,

I'm graduating in Computer Science at UFBA and work with a cluster at the Institute of Physics. Maybe we could meet and talk about your cluster.


Quoting Guilherme Rocha <guilherme@gf7.com.br>:

Hello all,
My name is Guilherme Rocha. I'm a biotechnologist and a Debian user since Potato: a stupid old user who thinks he is an advanced user, nothing more than that.
I sometimes help the Debian l10n team localize Debian to PT_BR.
I'm in charge of planning and building a cluster in our lab. Our lab is Genev - Laboratory of Genetics of Population and Molecular Evolution,
at the Federal University of Bahia, Brazil.
We already run some tasks on a Dell server machine running Ubuntu, but very slowly. On a Dell quad-core running Ubuntu, this task (PALP analysis) takes 9 days to finish.
We want to reduce this time drastically.

So we would like to hear from you, gurus, about the best practices for doing this,
and also to understand whether we will get a significant time reduction with our hardware, described below.
   * We first need to identify what we want/need. What is the (typical)
     problem you want to solve?
To use Debian Med for phylogenetic analysis, protein modeling, DNA alignment, genetics stuff...
Open-source software like PALP, GAMGI, GARLIC, GDPC, PyMOL, Perl Primer, etc.

   * What software do you need for that? Do you need a batch scheduler,
     or do you have very few users who work at the same place and share
     the cluster without technical measures?
We'll have very few people, about 10 I think. I'm not sure whether the tasks need to be scheduled in order to run. We intend to use Debian Med (the med-bio metapackage) running on
a small Beowulf cluster, roughly 10 to 15 nodes.


   * think about the OS (Debian is a good choice here ;))

Yes, sure, Debian Med.  :)


   * Think about the compute hardware, you probably need a login
     node, execute nodes and a file server, do you need many local
     cores or are the problems too large to fit into a few nodes?
We have very obsolete hardware. Our server node will be a Pentium IV 1.5 GHz with 1 GB RAM, with worker nodes ranging from K6 500 MHz (5 units) to Pentium III 266 MHz (10 units), plus Atom 1 GHz thin clients.
Questions:

Could thin clients with Atom processors be used?
Would the performance be good enough?



Then you need to look into networking
(InfiniBand or high-performance Ethernet), is the software susceptible to
latency and/or bandwidth available......
We have a 10/100 switch. We are looking into the possibility of acquiring a 10/100/1000 switch.
So the questions are:

  1. Will we get a significant time reduction on these tasks with this
     hardware?
  2. Can we use thin clients to build a cluster?
  3. Is there some "Debian Beowulf way" we should review before starting?
  4. Might another type of cluster be better than Beowulf for this?
  5. Any ideas will be very welcome.
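
One rough way to frame question 1 is an Amdahl-style estimate: only the part of a job that can be split across nodes gets faster, and slow nodes contribute little. The sketch below is illustrative only. The function name cluster_days, the parallel fractions, and the relative node speeds are all assumptions, not measurements of PALP or of the hardware listed above; only the 9-day baseline comes from this message.

```python
def cluster_days(baseline_days, parallel_fraction, node_speeds):
    """Amdahl-style run-time estimate for a heterogeneous cluster.

    baseline_days: run time on the reference machine (speed 1.0).
    parallel_fraction: share of the work that can be split across nodes (0..1).
    node_speeds: speed of each node relative to the reference machine.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if not node_speeds or min(node_speeds) <= 0:
        raise ValueError("node_speeds must be non-empty and positive")
    # The serial part runs on the single fastest node.
    serial_days = baseline_days * (1.0 - parallel_fraction) / max(node_speeds)
    # The parallel part is shared ideally across all nodes (no network cost).
    parallel_days = baseline_days * parallel_fraction / sum(node_speeds)
    return serial_days + parallel_days


if __name__ == "__main__":
    baseline_days = 9.0  # from the message: PALP analysis on the quad-core
    # HYPOTHETICAL relative speeds (placeholders, not benchmarks):
    # 1 server node, 5 mid-range workers, 10 slow workers.
    node_speeds = [0.3] + [0.1] * 5 + [0.05] * 10
    for parallel_fraction in (0.5, 0.9, 0.99):
        days = cluster_days(baseline_days, parallel_fraction, node_speeds)
        print(f"parallel fraction {parallel_fraction}: about {days:.1f} days")
```

Timing the real job on two or three nodes would replace the guessed parallel fraction and node speeds with measured values. The model also ignores communication over the switch, so it is an optimistic bound.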


Cheers, and long live Debian,


--
Guilherme Rocha
GF7 Doc & Systems - Soluções Tecnológicas
Home Page: http://www.gf7.com.br
Telephone: + 55 71 4062 9142
Mobile:   + 55 71 9279 0829




--

Rogerio de Carvalho Bastos

http://wiki.dcc.ufba.br/Main/RogerioBastos
