
Re: beowulf with afs

Jorge L. deLyra wrote:

>> of building a cluster with ~150 nodes and about 100-140 GB/node of
>> 'free' hard drive space.
>
> Man, that's a lot of disk. Just a wild idea I'm curious about: has anybody
> ever tried exporting disks from the nodes via nbd (the network block
> device) and assembling a huge RAID-0 (striping) array out of the nbd's on
> the front end? This might make for a very fast disk, even with IDE drives
> and all the traffic going through the network. It would be sort of a funny
> way to use a cluster...

That would be cool... especially if you could make it somewhat redundant, like RAID-4 or -5. Kinda like EMC-class storage on the cheap!
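For the curious, a minimal sketch of the idea (hostnames, ports, and device names here are made up for illustration; this assumes the classic nbd-server/nbd-client tools and mdadm):

```shell
# On each node: export the scratch partition over the network block device
# (port 2000 and /dev/hdb1 are assumptions, adjust to taste)
nbd-server 2000 /dev/hdb1

# On the front end: attach each node's export as a local block device
nbd-client node001 2000 /dev/nbd0
nbd-client node002 2000 /dev/nbd1
# ...one /dev/nbdN per node...

# Stripe them into one big RAID-0 array, then put a filesystem on it
mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/nbd0 /dev/nbd1
mke2fs /dev/md0
mount /dev/md0 /mnt/bigdisk
```

Of course, being RAID-0, losing any one node's disk (or the node itself) takes the whole array with it, which is why the redundancy of RAID-4/5 would be so appealing.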

The latest Illuminator (0.4.1) does something like this. (Sorry, it's not yet in unstable; I'm waiting for a working mpich with shared libs to go in, since the latest one doesn't build on ia64 or hppa.) Using PETSc distributed arrays, each node saves/loads its local part of the data to a separate file, optionally compressed. If the filename points at a local disk, the effect is like a giant RAID-0.

Then you "play back" the data by doing a distributed read from the local disks, a distributed triangulation (of contour surfaces in 3-D), and rendering and rotation in Geomview on the head node. We've got 40 GB/node, which is 20 GB/CPU, on our latest cluster, and we plan to use it all for time series data. (We're working on distributed rendering too, but that's still a few months off, maybe 0.6, or perhaps it'll be worthy of 0.9/beta by then.)

But you have to use PETSc distributed arrays for this to work. Also, it really is RAID-0: if you lose one disk, you have an incomplete data set and it's pretty much worthless. :-( Rewriting PETSc to be robust to node failure would take just a few man-years...

Regarding an earlier post, I kind of like running our cluster "diskless" (NFS-root) with local disks for scratch storage, because it greatly simplifies administration. In such a setup, using AFS/Coda/Intermezzo for, say, /home, /usr, /var, /tmp, /etc. (ha ha) might make sense: the nodes would cache frequently-used files on their local scratch disks, cutting down on network traffic and letting distributed jobs start faster than they do over NFS. Right?
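The piece that ties this together is pointing the client-side cache at the local scratch disk. A sketch for the AFS case, assuming OpenAFS with its usual cacheinfo file (the /scratch path and the cache size are made-up values):

```shell
# Tell the AFS cache manager to keep its cache on the local scratch disk.
# cacheinfo format: <afs mountpoint>:<cache directory>:<size in 1K blocks>
mkdir -p /scratch/afscache
echo "/afs:/scratch/afscache:2000000" > /usr/vice/etc/cacheinfo

# Start the cache manager; repeat reads of hot files now hit /scratch
# instead of going back over the network every time.
afsd
```

Coda and Intermezzo have analogous client-cache settings, so the same trick should apply there.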

I could even envision this being useful for lots of client workstations: the centralized administration would let it scale to thousands of seats, and Coda/Intermezzo would make it somewhat robust to network failure. Just get a boatload of these $199 Lindows machines and netboot them all... But that's not a Beowulf, so it's off-topic -- unless somebody harvests the machines' spare cycles...

I imagine Coda-root, etc. is a ways off; another approach might be to keep root on an initrd and mount everything else on top of it, in order to run "diskless" without NFS... if one really wanted to.
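The initrd-as-root idea is old hat for boot floppies; something like this rough sketch would do it (sizes and paths are arbitrary, and the minimal root tree is left as an exercise):

```shell
# Build a small ext2 ramdisk image to serve as the root filesystem
dd if=/dev/zero of=initrd.img bs=1k count=8192
mke2fs -F initrd.img
mount -o loop initrd.img /mnt/initrd
cp -a /some/minimal-root/. /mnt/initrd/   # your stripped-down root tree
umount /mnt/initrd
gzip -9 initrd.img

# Then netboot the kernel with something like:
#   root=/dev/ram0 initrd=initrd.img.gz
# and mount AFS/Coda (or nothing at all) on top of the ramdisk root.
```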

Okay, enough idle musing for an evening.


-Adam P.

GPG fingerprint: D54D 1AEE B11C CE9B A02B  C5DD 526F 01E8 564E E4B6

Welcome to the best software in the world today cafe! <http://lyre.mit.edu/%7Epowell/The_Best_Stuff_In_The_World_Today_Cafe.ogg>
