>Date: Sat, 13 Aug 2005 12:19:03 -0500 (CDT)
>From: Mike Miller <mbmiller at taxa.epi.umn.edu>

>I want to do mostly scientific work plus email on a GNU/Linux server. 
>Apparently all the major distros have 64-bit versions available, which 
>helps.  Now I'm trying to decide which distro will be best for my uses.

Perhaps a more specialized Linux distribution would be more suitable,
such as:

https://www.scientificlinux.org/

http://bioinformatics.org/biobrew/

http://dnalinux.com/

These are usually Live CDs and may be x86 only (as opposed to AMD64).
As far as I know, they are being actively developed and are useful in
their specialty, but I have not used them.

I still think a cluster is what you need.  Several Linux cluster
distributions are available as Live CDs that are easy to set up.  They
should also have hard drive install options.

>I know that this is the kind of thing that starts "holy wars" and I really 
>am not trying to start any such thing!  ;-)  If someone can give me 
>comparative information about the distros that have been recommended, that 
>would help.  If you just want to say what you like or prefer, you can do 
>that, but that kind of information is probably not going to help me much. 
>If you can specify *why* you prefer one to another (and you actually have 
>experience with both!), that would be really valuable information.

One good reason to choose Debian over all others is the huge number of
well-supported packages available for it, along with the deb package
format and apt-get.  I also use Fedora Core 4, but there aren't nearly
as many packages for it, the packages are in rpm format, which many
consider inferior to deb, and yum doesn't seem to work as well as
apt-get.  SUSE is also an excellent distribution and will get much
better due to the recent announcement of http://www.openSUSE.org/.

>So what are the biggest differences between these distros for a server 
>class machine?  Would these differences affect users much or mostly just 
>administrators?

The most important difference is how difficult it will be to install
missing software and configure the Linux distribution to do what you
need it to do.  There should not be much difference in performance,
since the Linux kernel and most of the software will be the same,
although it may be packaged differently.

>To clarify what I'll be doing:  Mostly numerical analysis (in statistical 
>genetics) using specialized packages but also using Octave, R, and other 
>standard GPL code.  I will have about a dozen users all running VNC 
>(Enterprise Edition) desktops on the server.  I want to run postfix and 
>apache and some kind of webmail server.  So, I won't be doing things like 
>playing DVDs or music or games, so any distro differences on that kind of 
>stuff is irrelevant.

Just use the distribution that includes most of the software you need.
How well a cluster will work depends on how fast the interconnect is and
how much information must flow between nodes.  These same issues will
need to be addressed in a single system with several multi-core CPUs,
although the interconnect is bound to be much faster in a single
(non-cluster) system.
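
To make the interconnect trade-off concrete, here is a rough
back-of-the-envelope sketch in Python.  It is only an illustration, not
a benchmark: the function name, its parameters and the model itself
(compute divides perfectly across nodes, communication is not
overlapped with compute) are my own simplifying assumptions.

```python
def estimated_runtime(compute_seconds, nodes, bytes_exchanged,
                      bandwidth_bytes_per_s, latency_s=0.0, messages=0):
    """Crude run-time estimate for a job spread over `nodes` machines.

    Assumptions (hypothetical model, for illustration only):
      * the compute work divides evenly across nodes;
      * all communication is serialized and not overlapped with compute;
      * a single node exchanges nothing over the interconnect.
    """
    compute = compute_seconds / nodes
    if nodes == 1:
        return compute
    # Each message pays a fixed latency; the payload pays for bandwidth.
    communication = messages * latency_s + bytes_exchanged / bandwidth_bytes_per_s
    return compute + communication


if __name__ == "__main__":
    # 100 s of work on 4 nodes, moving 1 GB over a 100 MB/s link.
    print(estimated_runtime(100.0, 4, 1e9, 1e8))
```

If the communication term is small next to compute_seconds / nodes, a
cluster pays off; if it dominates, adding nodes will not help much and
a single multi-CPU box is likely the better choice.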

Sincerely,

Ken Fuchs <kfuchs at winternet.com>