Hi TCLUG!  This is my first post to the mailing list.  

I have a question about IBM Netfinity servers and the ServeRAID controller (aic7xxx series).  My company got a Netfinity 6000R (also known as an xSeries 350).  The machine has two Xeon processors and three 34 GB SCSI drives in a RAID 5 array.  We are running Red Hat 7.3, and we patched the kernel to 2.4.18-4.

I have looked EVERYWHERE trying to find information on the following error.  It appeared a few times in posts to the kernel and hardware mailing lists at kernel.org, but there never seemed to be a definitive answer.  

Whenever we try to move a large number of files through the RAID controller (general file moves to/from Samba clients, or to a tape backup), we receive 10 to 30 I/O errors like the following (the sector numbers vary):

Jun  5 08:09:07 gar kernel:  I/O error: dev 08:05, sector 49041320
Jun  5 08:09:07 gar kernel: SCSI disk error : host 3 channel 0 id 0 lun 0 return code = 70000
Jun  5 08:09:07 gar kernel:  I/O error: dev 08:05, sector 49041328
Jun  5 08:09:07 gar kernel: SCSI disk error : host 3 channel 0 id 0 lun 0 return code = 70000
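
In case it helps anyone read the lines above, here is a rough sketch of how I understand the two numbers can be decoded.  It assumes the 2.4 kernel prints both values in hex, that the return code packs driver/host/message/status bytes from high to low, and that the host byte names match the kernel's SCSI headers.  The function names are my own, purely for illustration.

```python
# Host byte names as I understand them from the kernel's SCSI headers
# (an assumption on my part, not verified against this exact kernel).
HOST_CODES = {
    0x00: "DID_OK", 0x01: "DID_NO_CONNECT", 0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT", 0x04: "DID_BAD_TARGET", 0x05: "DID_ABORT",
    0x06: "DID_PARITY", 0x07: "DID_ERROR", 0x08: "DID_RESET",
    0x09: "DID_BAD_INTR",
}

def decode_return_code(text):
    """Split a hex 'return code' string into its four one-byte fields."""
    result = int(text, 16)
    return {
        "status": result & 0xFF,          # SCSI status byte from the target
        "msg": (result >> 8) & 0xFF,      # message byte
        "host": (result >> 16) & 0xFF,    # set by the host adapter driver
        "driver": (result >> 24) & 0xFF,  # set by the mid-level driver
    }

def decode_dev(text):
    """Map a 'major:minor' pair (hex) to a /dev/sdXN name for major 8 only."""
    major, minor = (int(part, 16) for part in text.split(":"))
    if major != 8:
        return None  # this sketch only handles the first SCSI disk major
    disk = chr(ord("a") + (minor >> 4))  # 16 minors per disk
    partition = minor & 0x0F             # 0 means the whole disk
    return "/dev/sd" + disk + (str(partition) if partition else "")

fields = decode_return_code("70000")
print(HOST_CODES.get(fields["host"], "unknown"), decode_dev("08:05"))
```

If those assumptions hold, "return code = 70000" has a host byte of 0x07 (DID_ERROR) with the other three bytes zero, and "dev 08:05" is /dev/sda5.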


Does anyone have any insight into this problem?  So far I have tried turning the NMI_watchdog option off, changing the RAID controller's timer from its default 64ms to 256ms, and updating the BIOS, firmware, and driver levels on the motherboard, tape drive, RAID controller, etc.  The errors still occur.

The machine works fine when it's just poking along, but as soon as we try to move a lot of data through it, it generates these errors.

Thanks,

Brent Friedman

PS - I'm a developer by trade, not a dyed-in-the-wool admin type.  I have spent days trying to find a solution, but I'm not intimately familiar with SCSI/RAID errors.
