Quoting Mike Miller <mbmiller at taxa.epi.umn.edu>:

> We are thinking about putting together a cluster of maybe 10 machines,
> presumably using GNU/Linux.  Do any of you have experience with this?
>
> Some of the things I'm wondering about include the appropriate
> configuration of machines -- isn't it better in terms of cost/benefit to
> buy fewer dual quad-core machines than more single CPU machines,
> especially if the jobs are not very memory-intensive?
>
> We certainly want to use shared disks, but is there any problem with
> booting all the computers from the same network drive?  That seems like a
> good idea to me rather than to have separate HDDs in the machines, but I'm
> not sure how it is done.
>
> What free software is available for managing jobs, e.g., batch queuing?
>
> FYI ... The idea is to use these machines for our genetic analyses --
> maybe 600,000 SNPs on 7,500 people, but this mostly consists of running
> one SNP at a time on some collection of traits.  I don't think the memory
> requirements are too great unless we try to load a lot of the data at
> once.
>
> Mike
>

You might want to take a look at what the folks at UW-Madison are doing
with Condor:

http://www.cs.wisc.edu/condor/
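
Since your work is mostly one SNP at a time, it should split cleanly into
independent batch jobs, which is the kind of workload a queuing system
handles well. Here is a rough sketch of carving the SNP list into per-job
index ranges. This is plain Python, not Condor's API; the function name and
the 5,000-SNPs-per-job figure are my own assumptions, and only the 600,000
comes from your message.

```python
def chunk_ranges(n_items, chunk_size):
    """Return half-open (start, stop) index ranges that cover 0..n_items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [(start, min(start + chunk_size, n_items))
            for start in range(0, n_items, chunk_size)]


N_SNPS = 600_000       # figure from the original message
SNPS_PER_JOB = 5_000   # assumption: tune to how long one SNP takes to run

# Each tuple describes the SNP indices one batch job would process.
jobs = chunk_ranges(N_SNPS, SNPS_PER_JOB)
print(len(jobs), jobs[0], jobs[-1])
```

Each (start, stop) pair could then be handed to one queued job as its
arguments, so the scheduler decides which machine runs which slice.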

That said, I'm sure you're not the only one at the U who's looking to
do this; I have to imagine quite a few folks are already at it. Are
there any internal peer groups or other ways to collaborate with
campus folks?

Josh