Are the drives OK? What does smartctl say about them? Anything in
dmesg about it? You can try booting into Knoppix, testing the drives
with the SMART tools, then assembling the arrays (if the drives are
OK) and proceeding from there.
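
Roughly, the checks above could look like the following from a rescue
shell. This is only a sketch: the device names are assumed from the
quoted /proc/mdstat, and the helper names (check_drives,
kernel_errors, degraded_arrays) are made up for illustration.

```shell
# Sketch only: run as root; adjust device names to match the box.

# Print the SMART health verdict, attributes, and error log per drive.
check_drives() {
    for dev in "$@"; do
        smartctl -H -A -l error "$dev"
    done
}

# Show kernel messages mentioning ATA links, I/O errors, or md arrays.
kernel_errors() {
    dmesg | grep -iE 'ata[0-9]+|i/o error|md[0-9]+'
}

# Read /proc/mdstat-style text on stdin and print the arrays whose
# member map (e.g. [U_]) shows a missing member.
degraded_arrays() {
    awk '/^md[^ ]* :/ { name = $1 }
         match($0, /\[[U_]+\]/) && index(substr($0, RSTART, RLENGTH), "_") { print name }'
}

# Typical order of use (commented out so sourcing this changes nothing):
#   check_drives /dev/sda /dev/sdb
#   kernel_errors
#   mdadm --assemble --scan
#   degraded_arrays < /proc/mdstat
```

If degraded_arrays prints nothing after assembly, every array came up
with all of its members.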

Also, putting swap on a separate MD is no longer necessary; it can be
a logical volume with no loss of performance, see
https://lkml.org/lkml/2005/7/7/326. The same goes for the root MD, as
GRUB 2 can boot off nested LVM/MD with no problem (it can also boot
off RAID5/6 now too). Now if only LVM could do sparse file systems
à la ZFS.
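
For the swap-on-LVM idea, a minimal sketch might be the following. The
volume group name "vg0" and the 2G size are made-up examples, not
taken from this box, and the run/make_swap_lv helpers are
hypothetical. By default it only prints the commands it would run.

```shell
# Sketch only: DRY_RUN defaults to 1, so commands are printed, not run.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Create a logical volume named "swap" in volume group $1 with size $2,
# format it as swap, and enable it.
make_swap_lv() {
    vg=$1
    size=$2
    run lvcreate -L "$size" -n swap "$vg" &&
    run mkswap "/dev/$vg/swap" &&
    run swapon "/dev/$vg/swap"
}
```

"make_swap_lv vg0 2G" prints the three commands; setting DRY_RUN=0
first and running it as root actually executes them.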




On Thu, Jul 14, 2016 at 12:32 PM, gregrwm <tclug1 at whitleymott.net> wrote:
> what's up with my mdadm?  note how it's taking 97% of CPU:
>
>>top - 11:40:14 up  1:16,  1 user,  load average: 1.02, 1.04, 0.90
>>Tasks: 187 total,   2 running, 185 sleeping,   0 stopped,   0 zombie
>>Cpu0  : 24.8%us, 75.2%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
>>Mem:   2928404k total,   604280k used,  2324124k free,    17800k buffers
>>Swap:        0k total,        0k used,        0k free,   338536k cached
>>
>>    PID  VIRT  RES  NI %CPU    TIME+  S COMMAND
>>   2078  5728 1700   0 97.3  24:32.14 R mdadm --monitor --scan -f --pid-file=/var/run/mdadm/mdadm.pid
>>   3031  107m 3176   0  0.0   0:44.05 S bash
>>      1 19356 1600   0  0.0   0:31.68 S /sbin/init
>>   4718  873m 120m   0  0.0   0:21.58 S /usr/lib/libreoffice/program/soffice.bin --splash-pipe=6
>>    429     0    0   0  0.7   0:10.92 S [md3_raid1]
>>    705     0    0   0  1.0   0:10.39 S [md0_raid1]
>>      4     0    0   0  0.7   0:09.36 S [ksoftirqd/0]
>>    709     0    0   0  0.3   0:06.49 S [md2_raid1]
>>    585 10656  752  -4  0.0   0:04.21 S /sbin/udevd -d
>
> mdstat doesn't drop any clues (md0 is boot, md2 is swap, md3 is lvm):
>>#  cat /proc/mdstat
>>Personalities : [raid1]
>>md0 : active raid1 sdb2[1] sda2[0]
>>      204788 blocks super 1.0 [2/2] [UU]
>>
>>md2 : active raid1 sdb1[1] sda1[0]
>>      10238904 blocks super 1.2 [2/2] [UU]
>>
>>md3 : active raid1 sda5[0] sdb5[1]
>>      153598908 blocks super 1.1 [2/2] [UU]
>>      bitmap: 0/2 pages [0KB], 65536KB chunk
>>
>>unused devices: <none>
>
> right after rebooting i swapoff -a, just to narrow in on what's up.  if i
> leave it idle mdadm is still quiet a few hours later.  but as soon as i fire
> up some app, like libreoffice, mdadm pins the CPU and stays that way.  it's
> way bogged down, but stuff still works.
>
> a couple days ago this box froze up.  this has been happening since.  so
> probably the md3 raid is corrupted?  shouldn't mdstat say something?  what
> do i look at next?  should i try removing sda5, and then sdb5?  or some
> other way to know which one is good?  or is that just wasting time, do i
> need to ditch md3&fire up a fresh raid?
>
> _______________________________________________
> TCLUG Mailing List - Minneapolis/St. Paul, Minnesota
> tclug-list at mn-linux.org
> http://mailman.mn-linux.org/mailman/listinfo/tclug-list
>