> But I will also say that the structure of the linux
> cluster is also different than the SGI structure, SGI uses ccNUMA, this
> allows for memory and cpu access across the entire cluster at any point in
> time for any process, no shared memory limitations, etc.  
	But that's a hardware issue; there's no reason you couldn't get
Linux to bang on that hardware. In fact, I'm sure SGI is already doing
so. I know SGI has booted 64-processor Origins with Linux (tho I've only
seen the dmesg from a 32-proc machine, which was posted to Slashdot a
few months ago), and I've heard rumors of them trying to boot it on
128-proc machines.
	AFAIK, that involves being able to run on a ccNUMA architecture.

> the issue is that parallel
> linux clusters have a limited scalability in comparison to SGI (I don't
> remember numbers, something like 64 procs?), 
	Depends how it's done. You can have clusters of thousands of
machines on a network (just look at Fermilab's proposed 2000-node
cluster). The difficulty comes in routing traffic among those nodes.
(The best solution I've seen uses multiple network interfaces per node
and a really complicated dynamic routing algorithm.)
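	Just to put rough numbers on why the routing is the hard part: a
full mesh of direct links among N nodes needs N*(N-1)/2 wires, while a
hypercube-style layout only needs about log2(N) interfaces per node but
forces traffic to be routed over multiple hops. The little C program
below is only back-of-the-envelope arithmetic on my part (the 2000-node
figure is Fermilab's; the rest is just illustration, not any real
cluster's design):

/* Back-of-the-envelope: link counts for an N-node cluster.
 * Purely illustrative arithmetic, not a real cluster design. */
#include <math.h>
#include <stdio.h>

int main(void)
{
    double n = 2000.0;                      /* Fermilab-sized cluster */
    double mesh = n * (n - 1.0) / 2.0;      /* every node wired to every other */
    double nics = ceil(log(n) / log(2.0));  /* per-node links in a hypercube */

    printf("full mesh: %.0f links\n", mesh);             /* ~2 million */
    printf("hypercube: ~%.0f NICs per node, but traffic "
           "must be routed across hops\n", nics);
    return 0;
}

(Compile with -lm.) The point being: you can't just wire everything to
everything, so the cleverness all goes into the topology and the routing.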
	On a single node, Linux is currently limited to 32 processors on
32-bit architectures and 64 processors on 64-bit architectures, but I
expect this limitation to be lifted early in the 2.5 series. From what I
hear, it's just the size of some variables that limits this. I expect
that Linux may well be capable of booting on a 1024-proc machine by 2.6
time (no guarantees...). I'll almost guarantee that it won't run very
efficiently on that many processors, tho (it'll probably spend most of
its time trying to figure out what to do next). Maybe by the version
after 2.6, it'll start to compete with the big unices for scalability.
Remains to be seen.
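	To illustrate what I mean by "the size of some variables": if the
set of online CPUs is kept as a bitmask in a single unsigned long, you
can only describe as many CPUs as that word has bits -- 32 on a 32-bit
box, 64 on a 64-bit one. The snippet below is just a sketch of that idea
in plain C, not the actual kernel code:

/* Sketch: why a word-sized bitmask caps the number of CPUs.
 * Illustrative only -- not actual Linux kernel source. */
#include <limits.h>
#include <stdio.h>

typedef unsigned long cpu_mask_t;                 /* hypothetical mask type */
#define MAX_CPUS (sizeof(cpu_mask_t) * CHAR_BIT)  /* 32 or 64 */

int main(void)
{
    cpu_mask_t online = 0;
    unsigned int cpu;

    /* "Bring up" every CPU the mask can possibly describe. */
    for (cpu = 0; cpu < MAX_CPUS; cpu++)
        online |= 1UL << cpu;

    printf("a %lu-bit mask tracks at most %lu CPUs (mask = %#lx)\n",
           (unsigned long)(sizeof(cpu_mask_t) * CHAR_BIT),
           (unsigned long)MAX_CPUS, online);
    return 0;
}

Widen the mask type (or make it an array of words) and the ceiling goes
up; that's the sort of change I'd expect to see early in 2.5.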
	I'd really like to see some side-by-side comparisons of SPARC
Linux/Solaris, Alpha Linux/Tru64, and MIPS Linux/IRIX, for varying
numbers of processors. With large numbers of processors (and processes),
Linux will lose. However, for small numbers of procs and processes, I
think there's a chance it may actually win in some cases. I think we
need something like the Mindcraft tests (the second ones) to show us
where we stand (and some other tests by other groups as well... the
German "c't" magazine ran a test similar to Mindcraft's, and Linux blew
NT out of the water). It won't put the controversy to rest, but it will
show the Linux community (quite publicly) where it needs to improve.

> with Mosix and the sharing of
> multiple CPUs you get out of that hole somewhat, but you dig yourself
> another with the limitation on shared memory processes outside of the
> originating node.  This is well known and is a limitation I've hit, myself.
	Well, that's a Mosix issue, and I don't know enough to comment on
it. I'm sure they're working on it, tho. :)

> So, I say again, it's a monetary issue, no matter what "shading" is put on
> it, and that's all I would like is the understanding that in comparison, in
> some areas, pound for pound, cpu for cpu, proprietary things just work
> better, no ifs ands or buts, they just do.  
	And where the extra performance justifies the extra order of
magnitude in cost, they'll always have a place.

Let me try to synthesize what I mean:
-- On the same hardware as the commercial *nixes, Linux doesn't yet
scale as well as they do.
-- Within 2 Linux generations (maybe 5 years), we're likely to see Linux
performing as well on big hardware as the commercial *nixes do now (tho
as someone said, they are moving targets).
-- Big iron costs a lot more than a bunch of x86 boxes. In some cases,
it offers a lot more scalability, tho (notably, for things that aren't
readily parallelizable).
-- Some people are willing to pay for that extra bit of performance that
big iron offers. The question then becomes: what OS will run it? I think
Linux will eventually be able to take over a lot of the hardware that
the commercial *nixes run now.

Carl Soderstrom
-- 
Network Engineer
Real-Time Enterprises
(952) 943-8700