
Re: got a mdadm puzzler



On Friday, 11 March 2022 13:11:14 EST Andy Smith wrote:
> Hello,
> 
> On Thu, Mar 10, 2022 at 07:18:56AM -0500, gene heskett wrote:
> > 2. I've had since the last of about 20 installs of bullseye, a very
> > early boot message about ata6 at the 10 and 20 second marks of the
> > reboot  IF it was not a full powerdown reboot.
> 
> Did you not at any point think that letting us know what the exact
> error message was would be useful here?
> 
If that error message ever made it to the logs, I don't know which log
it went to. It's output to the screen, but I grepped syslog for ata6 and
found some entries. The first instance is from a reboot; the second is
from a bootup after a full powerdown of about 5 seconds:

Mar  8 15:55:01 coyote kernel: [    0.699889] ata6: SATA max UDMA/133 abar m2048@0xdf34b000 port 0xdf34b380 irq 126
Mar  8 15:55:01 coyote kernel: [    6.071345] ata6: link is slow to respond, please be patient (ready=0)
Mar  8 15:55:01 coyote kernel: [   10.759342] ata6: COMRESET failed (errno=-16)
Mar  8 15:55:01 coyote kernel: [   16.111352] ata6: link is slow to respond, please be patient (ready=0)
Mar  8 15:55:01 coyote kernel: [   20.791354] ata6: COMRESET failed (errno=-16)
Mar  8 15:55:01 coyote kernel: [   26.143351] ata6: link is slow to respond, please be patient (ready=0)
Mar  8 15:55:01 coyote kernel: [   55.843357] ata6: COMRESET failed (errno=-16)
Mar  8 15:55:01 coyote kernel: [   55.843417] ata6: limiting SATA link speed to 3.0 Gbps
Mar  8 15:55:01 coyote kernel: [   60.895357] ata6: COMRESET failed (errno=-16)
Mar  8 15:55:01 coyote kernel: [   60.895417] ata6: reset failed, giving up

Second instance, after the full powerdown:

Mar  8 15:56:06 coyote kernel: [    0.684458] ata6: SATA max UDMA/133 abar m2048@0xdf34b000 port 0xdf34b380 irq 127
Mar  8 15:56:06 coyote kernel: [    0.998471] ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Mar  8 15:56:06 coyote kernel: [    0.999415] ata6.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded
Mar  8 15:56:06 coyote kernel: [    0.999416] ata6.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
Mar  8 15:56:06 coyote kernel: [    0.999417] ata6.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
Mar  8 15:56:06 coyote kernel: [    1.000270] ata6.00: ATA-9: ST2000DM001-1ER164, CC25, max UDMA/133
Mar  8 15:56:06 coyote kernel: [    1.000272] ata6.00: 3907029168 sectors, multi 16: LBA48 NCQ (depth 32), AA
Mar  8 15:56:06 coyote kernel: [    1.001605] ata6.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded
Mar  8 15:56:06 coyote kernel: [    1.001606] ata6.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
Mar  8 15:56:06 coyote kernel: [    1.001607] ata6.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
Mar  8 15:56:06 coyote kernel: [    1.002460] ata6.00: configured for UDMA/133

And this is the bootup I'm running on.
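
For anyone wanting to repeat the comparison without eyeballing grep
output, here is a rough Python sketch that pulls one port's kernel
messages out of syslog-style lines and reports whether the link came up.
The function names port_events and link_came_up are made up for
illustration, and it assumes each kernel message sits on a single line,
as it does in the log file itself (not wrapped as a mail client might
show it).

```python
import re

# Matches syslog-style kernel lines such as:
#   Mar  8 15:55:01 coyote kernel: [   10.759342] ata6: COMRESET failed (errno=-16)
# The optional ".NN" covers device-level lines like "ata6.00: ...".
LINE_RE = re.compile(
    r"kernel: \[\s*(?P<t>\d+\.\d+)\] (?P<port>ata\d+)(?:\.\d+)?: (?P<msg>.*)"
)

def port_events(lines, port="ata6"):
    """Return (seconds_since_boot, message) pairs for one ATA port."""
    events = []
    for line in lines:
        m = LINE_RE.search(line)
        if m and m.group("port") == port:
            events.append((float(m.group("t")), m.group("msg")))
    return events

def link_came_up(events):
    """True if a link-up was reported, False if the kernel gave up, else None."""
    for _, msg in events:
        if "SATA link up" in msg:
            return True
        if "reset failed, giving up" in msg:
            return False
    return None
```

Feeding it the lines of one boot at a time (for example from
open("/var/log/syslog"), assuming that is where the kernel messages
land on this install) should give False for the warm reboot above and
True for the boot after the full powerdown.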


> > So how do I go about querying mdadm to determine whats going south
> > here?
> As is so often the case with the problems you bring up, we risk
> going down completely the wrong route because you do not supply
> actual error messages or complete problem descriptions and just tell
> us that you've decided it's a problem with XYZ subsystem. Much time
> is spent looking into XYZ only to later find it was irrelevant or
> the process could at least have been seriously improved by knowing
> the symptoms to begin with (infamous "enabling IPv6 causes my
> compile to fail" memories here).
> 
> > The man pages are quite verbose, but I can't seem to find how to
> > query what it has, without supplying the device names of all drives
> > that s/b part of the array.
> > 
> > And why do I see the ata6 error on a reboot, but not on a full
> > powerdown reboot?
> 
> So, WHAT IS "THE ATA6 ERROR"?
> 
> Without knowing that it's very hard to speculate but I would point
> out that things like "ATA" are way below the level of md which just
> knows about block devices. So if you are seeing an "ata6 error" it
> most likely has nothing whatsoever to do with your md setup, except
> in the sense that if you are having problems with your storage
> devices it's obviously going to percolate upwards and cause you RAID
> grief too.
> 
> > This is my first experience at software raid, so I am a new bee. My
> > fingers are at your command.
> 
> Then I command them to tell us useful information BEFORE you decide
> where the problem lies.
> 
> However, if you do want to know more about investigating your mdadm
> setup, you already got the hint about cat /proc/mdstat. Here's some
> other stuff:
> 
> https://raid.wiki.kernel.org/index.php/Asking_for_help
> 
> Thanks,
> Andy
> 
> --
> https://bitfolk.com/ -- No-nonsense VPS hosting
> 
> .
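
On the /proc/mdstat hint: that file lists every array the kernel
currently knows about, so no device names have to be supplied. Below is
a minimal Python sketch that turns its array header lines into a
dictionary. The function name parse_mdstat is hypothetical, and the
sample in the comment is an illustrative layout, not output from this
machine.

```python
import re

# Header line of an array stanza, for example:
#   md0 : active raid1 sdb1[1] sda1[0]
ARRAY_RE = re.compile(r"^(md\S+) : (\S+) (.*)$")

def parse_mdstat(text):
    """Map each md array name to its state, RAID level (if shown), and members."""
    arrays = {}
    for line in text.splitlines():
        m = ARRAY_RE.match(line)
        if not m:
            continue  # "Personalities", block counts, "unused devices", etc.
        name, state, rest = m.groups()
        tokens = rest.split()
        # Skip flags such as "(auto-read-only)" that can follow the state.
        while tokens and tokens[0].startswith("("):
            tokens.pop(0)
        # Members look like "sda1[0]"; a leading token without "[" is the level.
        level = tokens.pop(0) if tokens and "[" not in tokens[0] else None
        members = [t.split("[")[0] for t in tokens]
        arrays[name] = {"state": state, "level": level, "members": members}
    return arrays
```

Calling parse_mdstat(open("/proc/mdstat").read()) on the box in question
would show which member devices each array actually has, which can then
be compared against the drives that should be there.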


Cheers, Gene Heskett.
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author, 1940)
If we desire respect for the law, we must first make the law respectable.
 - Louis D. Brandeis



