(oops, forgot to CC the list)

On Tue, Mar 22, 2005 at 10:37:02PM -0600, jimstreit at northlans.com wrote:
> I want to take a few boxes, and run a couple common applications, like
> apache and mysql, so they load balance across the boxes.  If I need
> more processing power, I add more boxes.  All of the boxes would need
> to be able to access some type of common storage file system, Fibre
> Channel or iSCSI.
> 
> If a box drops, the others just keep going.

OK, I thought that might be what you wanted.  I tend to think of a
cluster as more of a bunch of computers working on CPU-heavy problems.

I haven't had to do much load balancing myself yet, but I have read up
on it plenty.  You're going to end up running master and slave database
boxes on the back end using replication, with Squid on the front end
doing caching and load balancing.  Alternatively, or in addition, you
can use round-robin DNS to distribute load across the web servers.
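
To make the round-robin idea concrete, here is a rough Python sketch of
the rotation itself.  It is only an illustration: real round-robin DNS
does this by rotating the order of the address records the name server
hands out, not by running application code, and the host names below
are made up.

```python
import itertools


class RoundRobin:
    """Hand out backends in a fixed rotation, one per request."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("need at least one backend")
        self._backends = list(backends)
        self._cycle = itertools.cycle(self._backends)

    def next_backend(self):
        # Each call returns the next server in the rotation and wraps
        # around to the first one after the last.
        return next(self._cycle)


# Hypothetical web server names, for illustration only.
pool = RoundRobin(["web1", "web2", "web3"])
picks = [pool.next_backend() for _ in range(5)]
```

Note that a plain rotation like this knows nothing about whether a box
is up, so on its own it won't route around a dead server.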

You might find this inspiring: 
http://meta.wikimedia.org/wiki/Wikimedia_servers

Specifically, take a careful look at
http://meta.wikimedia.org/wiki/Image:Wikimedia-servers-2005-01-30.png

As for a common storage file system, that may or may not be necessary.
Your optimal solution will depend heavily on the type of web application
you run.  For example, if your site's traffic is mostly read-only, that
simplifies things, because read-only traffic is very easy to scale out.
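
Here is a rough sketch of why read-mostly traffic scales out so easily
with a replicated database: reads can go to any slave, and only writes
have to hit the master.  This is just the routing idea in Python, under
my own assumptions.  The class, the host names, and the "starts with
SELECT" test are all made up for illustration, and a real setup would
also have to deal with replication lag.

```python
import random


class ReplicatedRouter:
    """Pick a database host: slaves for reads, master for writes."""

    def __init__(self, master, slaves, rng=None):
        self.master = master
        self.slaves = list(slaves)
        self._rng = rng or random.Random()

    def route(self, sql):
        # Crude assumption: anything starting with SELECT is a read.
        is_read = sql.lstrip().lower().startswith("select")
        if is_read and self.slaves:
            # Any slave can answer a read, so adding slaves adds
            # read capacity.
            return self._rng.choice(self.slaves)
        # Writes (and reads when no slave exists) go to the master.
        return self.master


# Hypothetical host names, for illustration only.
router = ReplicatedRouter("db-master", ["db-slave1", "db-slave2"])
write_host = router.route("INSERT INTO pages VALUES (1)")
read_host = router.route("SELECT * FROM pages")
```

Adding read capacity then just means adding another slave to the list,
while the master remains the one place writes go.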

I'd also like to hear others' take on this topic.

dan