From: "Pavel Knoblokh" Subject: Re: ahci: fix interrupt issues To: "Jan Klemkow" Cc: , Date: Fri, 13 Dec 2024 21:22:59 +1000 Thanks for the fix! If only this patch had come out a month earlier... I tried to debug this and even figured out that the interrupt never happened, but could not find why. I set the timeout to a low value to be able to do anything with the disks, but gave up after a few months of intermittent attempts to fix it. It was still good learning. I think the same problem was reported in [1], and I mentioned two affected controllers at the end of [2]. Unfortunately, I can't test them anymore as I had to install FreeBSD on that machine to get things going. I can still drop off an ASMedia ASM1166 M.2 adapter [3] somewhere around Brisbane as I know some devs live nearby. Please DM if interested in giving it a shot. Also, since you are around, would it be possible to get the patch from [2] through, or explain why not? Any feedback would be really appreciated. Thanks. [1] https://marc.info/?l=openbsd-misc&m=172207535832024&w=2 [2] https://marc.info/?l=openbsd-tech&m=172266168914808&w=2 [3] https://www.aliexpress.com/item/1005005669800665.html Regards, Pavel Knoblokh On Fri Nov 22, 2024 at 3:20 AM AEST, Jan Klemkow wrote: > Hi, > > Attaching ahci(4) with newer AMD 600 Series AHCI controler hangs. This is > caused via atascsi_attach() which creates residual bits in the interrupt status > register. atascsi_attach() runs in polling mode, but sets the interrupt status > bits anyway. When these bits are already set, we never get an interrupt and > run in a 60 second timeout for every ATA command send from sd(4) layer. Which > causes a kernel boot to last over an hour. > > The SATA AHCI 1.3.1 specification also saids: > > 10.1.2 System Software Specific Initialization > > [...] system software must always ensure that the PxIS (clear > this first) and IS.IPS (clear this second) registers are cleared to ‘0’ > before programming the PxIE and GHC.IE registers. This will prevent any > residual bits set in these registers from causing an interrupt to be > asserted. > > The diff below simply clears the interupt status register, which fixes this > issue. > > Tested with: > ahci0 at pci12 dev 0 function 0 "AMD 600 Series AHCI" rev 0x01: msi, AHCI 1.3.1 > > ok? > > bye, > Jan > > Index: dev/ic/ahci.c > =================================================================== > RCS file: /cvs/src/sys/dev/ic/ahci.c,v > diff -u -p -r1.42 ahci.c > --- dev/ic/ahci.c 4 Sep 2024 07:54:52 -0000 1.42 > +++ dev/ic/ahci.c 21 Nov 2024 16:39:16 -0000 > @@ -321,6 +321,9 @@ noccc: > > sc->sc_atascsi = atascsi_attach(&sc->sc_dev, &aaa); > > + /* Flush all residual bits of the interrupt status register */ > + ahci_write(sc, AHCI_REG_IS, ahci_read(sc, AHCI_REG_IS)); > + > /* Enable interrupts */ > ahci_write(sc, AHCI_REG_GHC, AHCI_REG_GHC_AE | AHCI_REG_GHC_IE); >