
Re: HPC - Kerrighed + others



Hi Will,

In addition to Camaleón's suggestions, I can answer a couple of your questions.

First, a few links that might get you on your way to the land of HPC:

The Debian wiki has a wealth of information:
http://wiki.debian.org/HighPerformanceComputing

There is also an article on debianadmin.com (which I suspect you may have found):
http://www.debianadmin.com/how-to-set-up-a-high-performance-cluster-hpc-using-debian-lenny-and-kerrighed.html

I'm personally not familiar with Kerrighed, but there is also PelicanHPC, which I was looking at when I worked in an HPC environment. Note that Red Hat (and CentOS) owns the cluster space: most businesses/entities with the money to throw at that much hardware also want RHEL licenses, and those that don't use CentOS. As a result, there may be apps that only come as .rpms or that you'll have to compile yourself. (I thought Sun Grid Engine was an example of this, but was pleasantly surprised to find it in the Debian repos.)
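For example, you can confirm this from any Debian box (the package names below are from my memory of the repos; double-check what your release actually ships):

```shell
# Search the Debian package index for Grid Engine components.
apt-cache search gridengine

# A typical split, assuming the usual package names:
# queue master on the head node, execution daemon on the compute nodes.
apt-get install gridengine-master gridengine-client   # head node
apt-get install gridengine-exec                       # each compute node
```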

As for the second part of your question, it depends on how you want to do it. We used Sun Grid Engine at NIH, not necessarily because it was better, but because it was already in place. Whether that was because the admins long ago were more skilled with it than with, say, Torque or one of the other schedulers, or because it was genuinely better at one point (or even now), I'm not sure.

You are going to want to set up at least one head node (I generally recommend a lower-resource machine for the head, to dissuade devs from running their apps directly on it). One thing we were starting to experiment with before I left was running VMware (or KVM, or Xen) on two physical boxes, with a relatively low-resource VM image as the head node, migrated between hosts with VMotion or an equivalent. Since the image is checkpointed between the VMware servers, you are less likely than with a traditional dual-physical-head-node configuration to lose work when hardware fails or a runaway job starves the head node of memory. It is significantly more expensive, since you have to pay for the extra hardware as well as the VMware license. (Does KVM do checkpointing similar to VMware's?) For us it was worth it, because jobs on the cluster would run for days, weeks, and months; you might consider whether it would be worth it for your particular use case. I've never used Matlab, so I can't comment on that. We were using a lot of (very old) bioinformatics software, which held challenges of its own.
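To make the scheduler side concrete, here is a minimal Grid Engine submit script. This is only a sketch: the job name, the runtime request, and the ./run_analysis program are made up for illustration, not from any real setup.

```shell
#!/bin/sh
# Minimal SGE batch script (sketch; adjust resources to your jobs).
#$ -N long_job             # job name, purely illustrative
#$ -cwd                    # run from the submission directory
#$ -l h_rt=168:00:00       # request a week of wall-clock time
#$ -o long_job.out         # stdout file
#$ -e long_job.err         # stderr file

./run_analysis             # hypothetical long-running program
```

You submit it from the head node with "qsub long_job.sh", and the scheduler dispatches it to a compute node rather than running it on the head node itself, which is exactly the behavior the low-resource head node is meant to encourage.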

Finally, as for running GUIs: you can ssh -X into the head node, fire off your app, and the GUI should appear on your local machine. Alternatively, depending on the scheduler, you could run the scheduler on the head node and use a separate machine to model your data (e.g. in the Matlab GUI)... if I understand the Matlab workflow correctly.
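In practice the first option looks something like this ("headnode" and the matlab invocation are placeholders for your own host and application):

```shell
# Log in with X11 forwarding enabled ("headnode" is a placeholder).
ssh -X user@headnode

# Then, on the head node, launch the GUI app; its windows are
# forwarded back over the ssh connection to your local display.
matlab &
```

Over slow links you may want "ssh -Y" or X compression, but the basic mechanism is the same.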

Hope that helped.
--b

On Wed, Jul 20, 2011 at 12:02 AM, wjryder <wjryder@me.com> wrote:
Hi,

I am looking at options to set up an HPC system. The system will have 6 blades to start, with more added in the future.

It will run Matlab, IDL and other similar programs. Which scheduling/load balancing software would be best to use?

I thought that Kerrighed looked interesting, as it should make the blades appear as a single computer.
Does anyone have any experience of this, and how would it work when running programs with GUIs?

Thanks

Will




