Well, it's just my opinion, but this sounds like an accident waiting to
happen.  Why are you going to be running Linux on your Suns?  Solaris is
much better suited to them.  If the software you need is on Solaris, why
not stay with Solaris?

Concerning the failover: how are you planning on accomplishing that?  A
daemon on the backup machine that polls the main server and, if it goes
down, mounts the disk and exports it?  How are you going to get your
clients to "fail over" to using the new server?  Whenever you need to
remount an NFS filesystem, you need to kill off any processes that are
using it (umount -k).
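
Just to make that polling idea concrete, here's roughly the kind of thing I
have in mind (a totally untested sketch, in Python for brevity; the
hostname, device node, and mount point are made up, and a real health check
ought to poke at mountd/nfsd rather than just pinging):

#!/usr/bin/env python
# Sketch of a poll-and-takeover daemon for the backup NFS server.
# Everything below (names, paths, thresholds) is hypothetical.

import subprocess
import time

PRIMARY = "nfs-primary.example.com"   # hypothetical name of the main server
POLL_INTERVAL = 10                    # seconds between health checks
MISSES_BEFORE_TAKEOVER = 3            # don't fail over on one dropped ping
ARRAY_DEVICE = "/dev/sdb1"            # hypothetical device node for the A1000
MOUNT_POINT = "/export/home"          # must already be listed in /etc/exports

def primary_alive():
    """Crude liveness check: one ping with a 2-second timeout."""
    return subprocess.call(
        ["ping", "-c", "1", "-W", "2", PRIMARY],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL) == 0

def take_over():
    """Mount the shared array on this box and start exporting it."""
    subprocess.check_call(["mount", ARRAY_DEVICE, MOUNT_POINT])
    subprocess.check_call(["exportfs", "-a"])

def main():
    misses = 0
    while True:
        if primary_alive():
            misses = 0
        else:
            misses += 1
            if misses >= MISSES_BEFORE_TAKEOVER:
                take_over()
                break   # we're the server now; a real daemon would keep watching
        time.sleep(POLL_INTERVAL)

if __name__ == "__main__":
    main()

That only covers the server side, though; getting the clients to remount
from the new box is still the hard part.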

I guess the first thing I thought of to solve your failover problem would
be to use a SAN with a not-so-platform-specific clustered filesystem like
Dataplow or Tivoli SANergy (or GFS, if it supported anything but Linux).
Then you could have both your servers hooked to some Fibre Channel disk and
set up redundant metadata servers for your clustered FS.  Unfortunately,
you still have the problem of getting your NFS clients to remount the
filesystem(s) from the backup server... unless you put every machine on the
SAN, but that would put anyone in the poorhouse...

A SAN requires a little more than pocket change, however...

My $.02.

BTW: If you _do_ figure out how to get your NFS clients to "fail over" to a
new server, I'd _love_ to know how you did it.  I'm facing a similar
problem at the moment.

Gabe

On Wed, Jun 27, 2001 at 07:52:35PM -0500, Mike Hicks wrote:
> This is a Really Weird Configuration(tm), but I'll just see what you folks
> think of it..
> 
> At work, my boss and I are trying to get Linux going on our Sun boxes. 
> Well, we know that the OS itself works, but the applications are
> problematic.  For those who were watching previously when I asked around
> for application support on Linux/Sparc, I'll say that we have had luck
> with one package (out of five): SAS.  We haven't actually tried it yet, but
> I talked to a rep there who said it would work on our systems.
> 
> Since only some software is available for Linux/Sparc, we're stuck with
> Solaris, at least on one or two systems.
> 
> My boss has long had the idea that we should have two servers connected to
> our A1000 array where our users' home directories are stored, so that we
> can fail over from one server to the other in our NFS configuration.  Of
> course, since we'd like to get at least one of these systems running
> Linux, things get really interesting.
> 
> I've successfully attached two systems together, and had them both reading
> the array at the same time (though, in normal operation, I'm expecting
> only one system would have the array mounted at a time).
> 
> I think we can get this to work, but I'm curious about a few things:
> 
> * How similar/different are Linux and Solaris NFS servers?  Will a client
>   be able to failover to a Linux server from a Solaris server without
>   going nuts?
> 
> * The Linux UFS driver handles the UFS variants in slightly different
>   ways.  How well can the Linux driver read/write Sun's implementation? 
>   Am I going to end up with a corrupted filesystem if I dare to try it?
> 
> * On my test setup, the Solaris box sees the A1000 array at SCSI ID 0, but
>   the Linux system sees it as SCSI ID 7.  Anyone have an explanation?



-- 
------------------------------------------------------------------------
Gabe Turner                                             gabe at msi.umn.edu
SGI Origin Systems Administrator,
University of Minnesota Supercomputing Institute
 for Digital Simulation and Advanced Computation         www.msi.umn.edu
------------------------------------------------------------------------