Just looking at one of the SMART reports, I do not see anything
particularly odd. Some numbers are high, but nothing that I associate
with immediate failure. On the other hand, there is something a lot
more ominous:

==> WARNING: A firmware update for this drive may be available,
see the following Seagate web pages:
http://knowledge.seagate.com/articles/en_US/FAQ/207931en
http://knowledge.seagate.com/articles/en_US/FAQ/223651en

There is a very good chance that bad firmware is masking the actual
problem. I have worked on at least two Seagate drives (and I am not
picking on Seagate here; IBM, WD and others have had their own share
of firmware issues) that started exhibiting signs of bad sectors
while SMART was not catching on and passed the drive with flying
colors. On one occasion, after I updated the firmware, SMART started
complaining about a drive whose failure was "imminent", which of
course was true; but with the updated firmware I was also able to work
with the drive to recover the data.
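
Since SMART may not be telling the whole story, the kernel log is the
other place to look. Here is a rough sketch (not a polished tool) that
tallies "hard resetting link" events per ATA port. The regular
expression is only modeled on the lines quoted below, the function name
count_link_resets is made up for illustration, and the sample lines are
copied from the quoted kern.log excerpt.

```python
import re
from collections import Counter

# Modeled on lines such as:
#   "... kernel: [611894.092285] ata5: hard resetting link"
# Assumption: other kernels may word this message differently.
RESET_RE = re.compile(r"\b(ata\d+)(?:\.\d+)?: hard resetting link")


def count_link_resets(lines):
    """Return a Counter mapping ATA port name (e.g. 'ata5') to reset count."""
    counts = Counter()
    for line in lines:
        match = RESET_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample lines taken from the kern.log excerpt quoted in this thread.
sample = [
    "Feb  3 18:31:46 home-desktop kernel: [611894.092285] ata5: hard resetting link",
    "Feb  3 18:31:51 home-desktop kernel: [611899.449242] ata5: EH complete",
    "Feb  3 18:32:17 home-desktop kernel: [611925.496061] ata6: hard resetting link",
    "Feb  3 18:32:22 home-desktop kernel: [611930.406116] ata5: hard resetting link",
    "Feb  3 18:32:29 home-desktop kernel: [611937.036141] ata6: hard resetting link",
]

for port, resets in sorted(count_link_resets(sample).items()):
    print(f"{port}: {resets} hard resets")
```

To run it over the real log, pass an open file object instead of the
sample list, for example count_link_resets(open("/var/log/kern.log")).
If the resets cluster on one or two ports, that at least narrows down
which drives, cables or controller ports deserve a closer look.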




On Tue, Feb 3, 2015 at 6:38 PM, Mark Mitchell
<mark.russel.mitchell at gmail.com> wrote:
> I'm running my first RAID array in a machine I built just short of a
> year ago.  I'm getting repeated messages in kern.log about ata resets
> on 2 ata channels.
>
> I took one of the affected drives out of the array, and ran a smart
> long test on them (smart.sdd.txt, attached).  It shows a head flying
> time of 6912h+43m+51.802s (around 288 days).
>
> All of the drives on the system are showing pre-fail and OldAge in the
> smart reports.  I'm finding this difficult to believe, all of them
> except sda are only about a year old.
>
> Do I really have to go out and buy a bunch of new 3TB drives?
>
> Here are some representative errors from kern.log;
>
> ==> /var/log/kern.log <==
> Feb  3 18:31:46 home-desktop kernel: [611894.092255] ata5.00:
> exception Emask 0x10 SAct 0x40000001 SErr 0x10200 action 0xe frozen
> Feb  3 18:31:46 home-desktop kernel: [611894.092259] ata5.00: irq_stat
> 0x00400000, PHY RDY changed
> Feb  3 18:31:46 home-desktop kernel: [611894.092262] ata5: SError: {
> Persist PHYRdyChg }
> Feb  3 18:31:46 home-desktop kernel: [611894.092265] ata5.00: failed
> command: READ FPDMA QUEUED
> Feb  3 18:31:46 home-desktop kernel: [611894.092269] ata5.00: cmd
> 60/a0:00:22:c0:0a/00:00:09:00:00/40 tag 0 ncq 81920 in
> Feb  3 18:31:46 home-desktop kernel: [611894.092269]          res
> 40/00:00:22:c0:0a/00:00:09:00:00/40 Emask 0x10 (ATA bus error)
> Feb  3 18:31:46 home-desktop kernel: [611894.092272] ata5.00: status: { DRDY }
> Feb  3 18:31:46 home-desktop kernel: [611894.092274] ata5.00: failed
> command: READ FPDMA QUEUED
> Feb  3 18:31:46 home-desktop kernel: [611894.092278] ata5.00: cmd
> 60/08:f0:72:f9:66/02:00:08:00:00/40 tag 30 ncq 266240 in
> Feb  3 18:31:46 home-desktop kernel: [611894.092278]          res
> 40/00:00:22:c0:0a/00:00:09:00:00/40 Emask 0x10 (ATA bus error)
> Feb  3 18:31:46 home-desktop kernel: [611894.092281] ata5.00: status: { DRDY }
> Feb  3 18:31:46 home-desktop kernel: [611894.092285] ata5: hard resetting link
> Feb  3 18:31:51 home-desktop kernel: [611899.409269] ata5: SATA link
> up 1.5 Gbps (SStatus 113 SControl 310)
> Feb  3 18:31:51 home-desktop kernel: [611899.435209] ata5.00:
> configured for UDMA/33
> Feb  3 18:31:51 home-desktop kernel: [611899.449242] ata5: EH complete
> Feb  3 18:32:17 home-desktop kernel: [611925.496050] ata6: exception
> Emask 0x10 SAct 0x0 SErr 0x10002 action 0xe frozen
> Feb  3 18:32:17 home-desktop kernel: [611925.496054] ata6: irq_stat
> 0x00400000, PHY RDY changed
> Feb  3 18:32:17 home-desktop kernel: [611925.496057] ata6: SError: {
> RecovComm PHYRdyChg }
> Feb  3 18:32:17 home-desktop kernel: [611925.496061] ata6: hard resetting link
> Feb  3 18:32:22 home-desktop kernel: [611930.406105] ata5: exception
> Emask 0x10 SAct 0x0 SErr 0x10200 action 0xe frozen
> Feb  3 18:32:22 home-desktop kernel: [611930.406109] ata5: irq_stat
> 0x00400000, PHY RDY changed
> Feb  3 18:32:22 home-desktop kernel: [611930.406111] ata5: SError: {
> Persist PHYRdyChg }
> Feb  3 18:32:22 home-desktop kernel: [611930.406116] ata5: hard resetting link
> Feb  3 18:32:24 home-desktop kernel: [611932.038938] ata6: SATA link
> up 1.5 Gbps (SStatus 113 SControl 310)
> Feb  3 18:32:28 home-desktop kernel: [611935.720865] ata5: SATA link
> up 1.5 Gbps (SStatus 113 SControl 310)
> Feb  3 18:32:28 home-desktop kernel: [611935.739014] ata5.00:
> configured for UDMA/33
> Feb  3 18:32:28 home-desktop kernel: [611935.752837] ata5: EH complete
> Feb  3 18:32:29 home-desktop kernel: [611937.036124] ata6.00: qc
> timeout (cmd 0xec)
> Feb  3 18:32:29 home-desktop kernel: [611937.036135] ata6.00: failed
> to IDENTIFY (I/O error, err_mask=0x4)
> Feb  3 18:32:29 home-desktop kernel: [611937.036137] ata6.00:
> revalidation failed (errno=-5)
> Feb  3 18:32:29 home-desktop kernel: [611937.036141] ata6: hard resetting link
> Feb  3 18:32:30 home-desktop kernel: [611937.527854] ata6: SATA link
> up 1.5 Gbps (SStatus 113 SControl 310)
> Feb  3 18:32:30 home-desktop kernel: [611937.528629] ata6.00: supports
> DRM functions and may not be fully accessible
> Feb  3 18:32:30 home-desktop kernel: [611937.529644] ata6.00: supports
> DRM functions and may not be fully accessible
> Feb  3 18:32:30 home-desktop kernel: [611937.529824] ata6.00:
> configured for UDMA/33
> Feb  3 18:32:30 home-desktop kernel: [611937.529997] ata6: EH complete
>
> Here's my drive layout;
> mark at home-desktop:~$ sudo lsblk
> NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
> sda           8:0    0 931.5G  0 disk
> ├─sda1        8:1    0    37M  0 part  /boot/efi
> ├─sda2        8:2    0  37.3G  0 part  [SWAP]
> ├─sda3        8:3    0 860.8G  0 part  /home
> └─sda4        8:4    0  33.5G  0 part  /
> sdb           8:16   0   2.7T  0 disk
> └─sdb1        8:17   0   2.7T  0 part
>   └─md0       9:0    0   8.2T  0 raid5
>     └─md0p1 259:0    0   8.2T  0 md    /srv/media
> sdc           8:32   0   2.7T  0 disk
> └─sdc1        8:33   0   2.7T  0 part
>   └─md0       9:0    0   8.2T  0 raid5
>     └─md0p1 259:0    0   8.2T  0 md    /srv/media
> sdd           8:48   0   2.7T  0 disk
> └─sdd1        8:49   0   2.7T  0 part
>   └─md0       9:0    0   8.2T  0 raid5
>     └─md0p1 259:0    0   8.2T  0 md    /srv/media
> sde           8:64   0   2.7T  0 disk
> └─sde1        8:65   0   2.7T  0 part
>   └─md0       9:0    0   8.2T  0 raid5
>     └─md0p1 259:0    0   8.2T  0 md    /srv/media
> sr0          11:0    1   4.3G  0 rom
>
> _______________________________________________
> TCLUG Mailing List - Minneapolis/St. Paul, Minnesota
> tclug-list at mn-linux.org
> http://mailman.mn-linux.org/mailman/listinfo/tclug-list
>