Interesting. Without diving into this too much due to time constraints...

> I took one of the affected drives out of the array, and ran a smart
> long test on them (smart.sdd.txt, attached).  It shows a head flying
> time of 6912h+43m+51.802s (around 288 days).

These should still be under warranty; most drives currently come with
at least two years of coverage. WD and Seagate have both handled RMAs
well for bad drives I've had before. Can't speak for Hitachi.

> All of the drives on the system are showing pre-fail and OldAge in the
> smart reports.  I'm finding this difficult to believe, all of them
> except sda are only about a year old.

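Are you sure you're not just looking at the TYPE column? I ask because
every attribute always shows either Pre-fail or Old_age there. That
column only classifies what kind of attribute it is; it is not a
health status. An attribute is actually in trouble when WHEN_FAILED
shows something other than "-", or when VALUE has dropped to THRESH or
below.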
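
To make the difference between the attribute type and an actual
failure concrete, here is a minimal Python sketch that reads the
attribute table printed by "smartctl -A" and lists only the rows that
indicate real trouble. It assumes the usual ten-column layout (ID#,
ATTRIBUTE_NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED,
RAW_VALUE). The function name and the SAMPLE text are made up for
illustration; the numbers are not taken from your attached report.

```python
def failing_attributes(smartctl_output):
    """Return (name, value, thresh, when_failed) for attributes in trouble.

    The TYPE column (Pre-fail / Old_age) is deliberately ignored: it only
    says what kind of attribute this is, not whether it has failed.
    """
    flagged = []
    for line in smartctl_output.splitlines():
        fields = line.split(None, 9)
        # Skip the header and anything that is not an attribute row.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name = fields[1]
        value = int(fields[3])
        thresh = int(fields[5])
        when_failed = fields[8]
        # A threshold of 0 means the attribute is informational only.
        below_threshold = thresh > 0 and value <= thresh
        if when_failed != "-" or below_threshold:
            flagged.append((name, value, thresh, when_failed))
    return flagged


# Illustrative input only (hypothetical values, not from a real drive).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   117   099   006    Pre-fail  Always       -       148579456
  5 Reallocated_Sector_Ct   0x0033   030   030   036    Pre-fail  Always   FAILING_NOW 2872
  9 Power_On_Hours          0x0032   092   092   000    Old_age   Always       -       7012
"""

if __name__ == "__main__":
    for row in failing_attributes(SAMPLE):
        print(row)
```

In the sample, all three rows carry a Pre-fail or Old_age type, but
only Reallocated_Sector_Ct is reported, because its VALUE is below
THRESH and WHEN_FAILED is set. On a real system you would feed it the
output of "sudo smartctl -A /dev/sdd" instead of SAMPLE.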

Also, if you Google around for Seagate and seek errors, you'll find
that Seagate drives are notorious for not reporting these correctly in
SMART. I would recommend you reach out to the respective drive
manufacturers.
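
For what it's worth, one commonly repeated but unofficial reading of
the Seagate Seek_Error_Rate raw value is that it packs two counters
into 48 bits: the upper 16 bits are the number of seek errors and the
lower 32 bits are the total number of seeks. I'm not aware of Seagate
documenting this, so treat the layout as an assumption. Under that
assumption, a rough Python sketch for splitting the raw value looks
like this (the function name is hypothetical):

```python
def split_seagate_seek_raw(raw_value):
    """Split a 48-bit raw value into (seek_errors, total_seeks).

    ASSUMPTION: upper 16 bits = error count, lower 32 bits = seek count.
    This layout is community lore, not a documented Seagate format.
    """
    seek_errors = (raw_value >> 32) & 0xFFFF
    total_seeks = raw_value & 0xFFFFFFFF
    return seek_errors, total_seeks


if __name__ == "__main__":
    # A large raw number that still decodes to zero errors.
    print(split_seagate_seek_raw(53000000))
    # Two errors over one hundred seeks, packed by hand for illustration.
    print(split_seagate_seek_raw((2 << 32) + 100))
```

If that reading holds, a raw value in the tens of millions can simply
mean a lot of seeks and no errors, which would explain why the number
looks alarming on an otherwise healthy drive.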

--
Jeremy MountainJohnson
Jeremy.MountainJohnson at gmail.com


On Tue, Feb 3, 2015 at 6:38 PM, Mark Mitchell
<mark.russel.mitchell at gmail.com> wrote:
> I'm running my first RAID array in a machine I built just short of a
> year ago.  I'm getting repeated messages in kern.log about ata resets
> on 2 ata channels.
>
> I took one of the affected drives out of the array, and ran a smart
> long test on them (smart.sdd.txt, attached).  It shows a head flying
> time of 6912h+43m+51.802s (around 288 days).
>
> All of the drives on the system are showing pre-fail and OldAge in the
> smart reports.  I'm finding this difficult to believe, all of them
> except sda are only about a year old.
>
> Do I really have to go out and buy a bunch of new 3TB drives?
>
> Here are some representative errors from kern.log;
>
> ==> /var/log/kern.log <==
> Feb  3 18:31:46 home-desktop kernel: [611894.092255] ata5.00:
> exception Emask 0x10 SAct 0x40000001 SErr 0x10200 action 0xe frozen
> Feb  3 18:31:46 home-desktop kernel: [611894.092259] ata5.00: irq_stat
> 0x00400000, PHY RDY changed
> Feb  3 18:31:46 home-desktop kernel: [611894.092262] ata5: SError: {
> Persist PHYRdyChg }
> Feb  3 18:31:46 home-desktop kernel: [611894.092265] ata5.00: failed
> command: READ FPDMA QUEUED
> Feb  3 18:31:46 home-desktop kernel: [611894.092269] ata5.00: cmd
> 60/a0:00:22:c0:0a/00:00:09:00:00/40 tag 0 ncq 81920 in
> Feb  3 18:31:46 home-desktop kernel: [611894.092269]          res
> 40/00:00:22:c0:0a/00:00:09:00:00/40 Emask 0x10 (ATA bus error)
> Feb  3 18:31:46 home-desktop kernel: [611894.092272] ata5.00: status: { DRDY }
> Feb  3 18:31:46 home-desktop kernel: [611894.092274] ata5.00: failed
> command: READ FPDMA QUEUED
> Feb  3 18:31:46 home-desktop kernel: [611894.092278] ata5.00: cmd
> 60/08:f0:72:f9:66/02:00:08:00:00/40 tag 30 ncq 266240 in
> Feb  3 18:31:46 home-desktop kernel: [611894.092278]          res
> 40/00:00:22:c0:0a/00:00:09:00:00/40 Emask 0x10 (ATA bus error)
> Feb  3 18:31:46 home-desktop kernel: [611894.092281] ata5.00: status: { DRDY }
> Feb  3 18:31:46 home-desktop kernel: [611894.092285] ata5: hard resetting link
> Feb  3 18:31:51 home-desktop kernel: [611899.409269] ata5: SATA link
> up 1.5 Gbps (SStatus 113 SControl 310)
> Feb  3 18:31:51 home-desktop kernel: [611899.435209] ata5.00:
> configured for UDMA/33
> Feb  3 18:31:51 home-desktop kernel: [611899.449242] ata5: EH complete
> Feb  3 18:32:17 home-desktop kernel: [611925.496050] ata6: exception
> Emask 0x10 SAct 0x0 SErr 0x10002 action 0xe frozen
> Feb  3 18:32:17 home-desktop kernel: [611925.496054] ata6: irq_stat
> 0x00400000, PHY RDY changed
> Feb  3 18:32:17 home-desktop kernel: [611925.496057] ata6: SError: {
> RecovComm PHYRdyChg }
> Feb  3 18:32:17 home-desktop kernel: [611925.496061] ata6: hard resetting link
> Feb  3 18:32:22 home-desktop kernel: [611930.406105] ata5: exception
> Emask 0x10 SAct 0x0 SErr 0x10200 action 0xe frozen
> Feb  3 18:32:22 home-desktop kernel: [611930.406109] ata5: irq_stat
> 0x00400000, PHY RDY changed
> Feb  3 18:32:22 home-desktop kernel: [611930.406111] ata5: SError: {
> Persist PHYRdyChg }
> Feb  3 18:32:22 home-desktop kernel: [611930.406116] ata5: hard resetting link
> Feb  3 18:32:24 home-desktop kernel: [611932.038938] ata6: SATA link
> up 1.5 Gbps (SStatus 113 SControl 310)
> Feb  3 18:32:28 home-desktop kernel: [611935.720865] ata5: SATA link
> up 1.5 Gbps (SStatus 113 SControl 310)
> Feb  3 18:32:28 home-desktop kernel: [611935.739014] ata5.00:
> configured for UDMA/33
> Feb  3 18:32:28 home-desktop kernel: [611935.752837] ata5: EH complete
> Feb  3 18:32:29 home-desktop kernel: [611937.036124] ata6.00: qc
> timeout (cmd 0xec)
> Feb  3 18:32:29 home-desktop kernel: [611937.036135] ata6.00: failed
> to IDENTIFY (I/O error, err_mask=0x4)
> Feb  3 18:32:29 home-desktop kernel: [611937.036137] ata6.00:
> revalidation failed (errno=-5)
> Feb  3 18:32:29 home-desktop kernel: [611937.036141] ata6: hard resetting link
> Feb  3 18:32:30 home-desktop kernel: [611937.527854] ata6: SATA link
> up 1.5 Gbps (SStatus 113 SControl 310)
> Feb  3 18:32:30 home-desktop kernel: [611937.528629] ata6.00: supports
> DRM functions and may not be fully accessible
> Feb  3 18:32:30 home-desktop kernel: [611937.529644] ata6.00: supports
> DRM functions and may not be fully accessible
> Feb  3 18:32:30 home-desktop kernel: [611937.529824] ata6.00:
> configured for UDMA/33
> Feb  3 18:32:30 home-desktop kernel: [611937.529997] ata6: EH complete
>
> Here's my drive layout;
> mark at home-desktop:~$ sudo lsblk
> NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
> sda           8:0    0 931.5G  0 disk
> ├─sda1        8:1    0    37M  0 part  /boot/efi
> ├─sda2        8:2    0  37.3G  0 part  [SWAP]
> ├─sda3        8:3    0 860.8G  0 part  /home
> └─sda4        8:4    0  33.5G  0 part  /
> sdb           8:16   0   2.7T  0 disk
> └─sdb1        8:17   0   2.7T  0 part
>   └─md0       9:0    0   8.2T  0 raid5
>     └─md0p1 259:0    0   8.2T  0 md    /srv/media
> sdc           8:32   0   2.7T  0 disk
> └─sdc1        8:33   0   2.7T  0 part
>   └─md0       9:0    0   8.2T  0 raid5
>     └─md0p1 259:0    0   8.2T  0 md    /srv/media
> sdd           8:48   0   2.7T  0 disk
> └─sdd1        8:49   0   2.7T  0 part
>   └─md0       9:0    0   8.2T  0 raid5
>     └─md0p1 259:0    0   8.2T  0 md    /srv/media
> sde           8:64   0   2.7T  0 disk
> └─sde1        8:65   0   2.7T  0 part
>   └─md0       9:0    0   8.2T  0 raid5
>     └─md0p1 259:0    0   8.2T  0 md    /srv/media
> sr0          11:0    1   4.3G  0 rom
>
> _______________________________________________
> TCLUG Mailing List - Minneapolis/St. Paul, Minnesota
> tclug-list at mn-linux.org
> http://mailman.mn-linux.org/mailman/listinfo/tclug-list
>