From: Andreas Bartelt Subject: Re: azalia(4) testers wanted! To: Mark Kettenis , Martijn van Duren Cc: tech@openbsd.org, alex@caoua.org, brynet@gmail.com, thfr@openbsd.org Date: Fri, 17 May 2024 10:43:19 +0200 I've tried the modified patch from Mark on my Ryzen 7950X / ASUS ProArt X670E-CREATOR WIFI based box. Without the patch, this box has the problem that azalia(4) never works directly after boot. However, it starts to work after suspend (via zzz) and resume. Sometimes, audio output hangs after some time but can be reenabled via another suspend/resume cycle. See also https://marc.info/?l=openbsd-bugs&m=167734571512380&w=2 for the original bug report thread. With the patch, azalia(4) now also works directly(!) after boot which is great. Note that I couldn't conduct any longer term testing of this patch yet. On 5/16/24 21:31, Mark Kettenis wrote: >> From: Martijn van Duren >> Date: Thu, 16 May 2024 17:19:41 +0200 > > Hi Folks, > >> >> On Sun, 2023-03-05 at 06:22 -0500, Bryan Steele wrote: >>> On Sun, Mar 05, 2023 at 08:53:00AM +0100, Alexandre Ratchov wrote: >>>> If you've an azalia(4) attaching as "AMD 17h/1xh HD Audio", please >>>> test this diff and report regressions. Especially audio lock ups that >>>> require reboot. >>>> >>>> IIRC, MSI was disabled few years ago to "fix" such lockups, and now >>>> this diff suggests we need MSI on certain boards. >>>> >>>> Context and diff below: >>>> >>>> >>> >>> No, this workaround is still needed. thfr@ and I tried to debug this years >>> ago but could not determine the cause at the time. >>> >>> This audio hang is still there on many systems, e.g: playback works for a >>> breif time until the it hangs, and only a reboot will fix it. But we could >>> never reproduce it with MSI disabled, so that was the best option we had. >>> >>> -Bryan. >>> >> This is a preliminary diff that passes my normal failures, but I haven't >> run for an elaborate amount of time. The problem is that the INTSTS can >> be set again by the controller without generating a new interrupt while >> we're still in the interrupt handler. Somehow I only found the Linux >> equivalent[0][1] when searching for my solution. The linux discussion >> also states that it was discovered on "some intel platforms", which >> matches my earlier statement about it also happening on my intel based >> laptop, but it mostly goes unnoticed there. So far I haven't properly >> tested this diff on my laptop since it occurs far and few in between. >> >> I don't know if this behaviour is timing-based, related to the >> interrupt mask, or maybe something else, so I do realise that this >> method of fixing things is a bit racy. However it is my understanding of >> what Linux does and does appear to fix my problems. The 10 retries is >> taken from Linux, but I haven't seen my card needing more than 1 extra >> iteration so far. >> >> I removed the MSI disable statement entirely so that people with other >> cards with the same symptoms can test it. >> >> I'll keep my computer running for some time to see if I can eventually >> hit the hang, but any feedback/tests is welcome. > > You're on to something. > > The main difference between legacy INTx interrupts and MSIs is that > the former are level-triggered, where as the latter are > edge-triggered. Level-triggered interrupts are a bit more forgiving > in the sense that if you don't handle all interrupts, the interrupt > will just fire again. But edge-triggered interrupts will only trigger > if the interrupt condition is properly cleared and a new interrupt > happens. > > At first sight the driver seems to do the right thing for > edge-triggered interrupts. The handler reads the INTSTS register and > then writes to it to clear the interrupt condition. It is reasonable > to expect that if further interrupts are triggered, we will get a new > MSI message. However, if you look at the spec: > > https://www.intel.com/content/www/us/en/standards/high-definition-audio-specification.html > > you'll see the INTSTS is a RO (read-only) registers; not a W1C > (write-one to clear) register. So on spec-compliant hardware, this > write doesn't do anything! Which means that any new interrupts coming > in while we're processing the initial ones will cause the interrupt to > remain asserted without a new MSI to happen. > > So instead the interrupt handler has to continue to process interrupts > as long as bits for the enabled interrupts are set. Only when none of > the bits are set can we return and be sure a new MSI to happen. > > Below is a new diff that does this and gets rid of all the bogus > INTSTS writes. > > This seems to work fine for me on Intel hardware. > > > Index: dev/pci/azalia.c > =================================================================== > RCS file: /cvs/src/sys/dev/pci/azalia.c,v > retrieving revision 1.286 > diff -u -p -r1.286 azalia.c > --- dev/pci/azalia.c 6 Mar 2024 00:11:25 -0000 1.286 > +++ dev/pci/azalia.c 16 May 2024 19:12:28 -0000 > @@ -176,6 +176,7 @@ typedef struct azalia_t { > int nistreams, nostreams, nbstreams; > stream_t pstream; > stream_t rstream; > + uint32_t intctl; > } azalia_t; > #define XNAME(sc) ((sc)->dev.dv_xname) > #define AZ_READ_1(z, r) bus_space_read_1((z)->iot, (z)->ioh, HDA_##r) > @@ -556,16 +557,6 @@ azalia_pci_attach(struct device *parent, > azalia_pci_write(sc->pc, sc->tag, ICH_PCI_MMC, reg); > } > > - /* disable MSI for AMD Summit Ridge/Raven Ridge HD Audio */ > - if (PCI_VENDOR(sc->pciid) == PCI_VENDOR_AMD) { > - switch (PCI_PRODUCT(sc->pciid)) { > - case PCI_PRODUCT_AMD_17_HDA: > - case PCI_PRODUCT_AMD_17_1X_HDA: > - case PCI_PRODUCT_AMD_HUDSON2_HDA: > - pa->pa_flags &= ~PCI_FLAGS_MSI_ENABLED; > - } > - } > - > /* interrupt */ > if (pci_intr_map_msi(pa, &ih) && pci_intr_map(pa, &ih)) { > printf(": can't map interrupt\n"); > @@ -684,7 +675,6 @@ azalia_pci_detach(struct device *self, i > AZ_WRITE_4(az, INTCTL, 0); > > DPRINTF(("%s: clear interrupts\n", __func__)); > - AZ_WRITE_4(az, INTSTS, HDA_INTSTS_CIS | HDA_INTSTS_GIS); > AZ_WRITE_2(az, STATESTS, HDA_STATESTS_SDIWAKE); > AZ_WRITE_1(az, RIRBSTS, HDA_RIRBSTS_RINTFL | HDA_RIRBSTS_RIRBOIS); > } > @@ -711,29 +701,27 @@ azalia_intr(void *v) > int ret = 0; > > mtx_enter(&audio_lock); > - intsts = AZ_READ_4(az, INTSTS); > - if (intsts == 0 || intsts == 0xffffffff) { > - mtx_leave(&audio_lock); > - return (ret); > - } > - > - AZ_WRITE_4(az, INTSTS, intsts); > + for (;;) { > + intsts = AZ_READ_4(az, INTSTS); > + if ((intsts & az->intctl) == 0 || intsts == 0xffffffff) > + break; > > - if (intsts & az->pstream.intr_bit) { > - azalia_stream_intr(&az->pstream); > - ret = 1; > - } > + if (intsts & az->pstream.intr_bit) { > + azalia_stream_intr(&az->pstream); > + ret = 1; > + } > > - if (intsts & az->rstream.intr_bit) { > - azalia_stream_intr(&az->rstream); > - ret = 1; > - } > + if (intsts & az->rstream.intr_bit) { > + azalia_stream_intr(&az->rstream); > + ret = 1; > + } > > - if ((intsts & HDA_INTSTS_CIS) && > - (AZ_READ_1(az, RIRBCTL) & HDA_RIRBCTL_RINTCTL) && > - (AZ_READ_1(az, RIRBSTS) & HDA_RIRBSTS_RINTFL)) { > - azalia_rirb_intr(az); > - ret = 1; > + if ((intsts & HDA_INTSTS_CIS) && > + (AZ_READ_1(az, RIRBCTL) & HDA_RIRBCTL_RINTCTL) && > + (AZ_READ_1(az, RIRBSTS) & HDA_RIRBSTS_RINTFL)) { > + azalia_rirb_intr(az); > + ret = 1; > + } > } > mtx_leave(&audio_lock); > return (ret); > @@ -918,7 +906,6 @@ azalia_init(azalia_t *az, int resuming) > /* clear interrupt status */ > AZ_WRITE_2(az, STATESTS, HDA_STATESTS_SDIWAKE); > AZ_WRITE_1(az, RIRBSTS, HDA_RIRBSTS_RINTFL | HDA_RIRBSTS_RIRBOIS); > - AZ_WRITE_4(az, INTSTS, HDA_INTSTS_CIS | HDA_INTSTS_GIS); > AZ_WRITE_4(az, DPLBASE, 0); > AZ_WRITE_4(az, DPUBASE, 0); > > @@ -932,8 +919,8 @@ azalia_init(azalia_t *az, int resuming) > if (err) > return(err); > > - AZ_WRITE_4(az, INTCTL, > - AZ_READ_4(az, INTCTL) | HDA_INTCTL_CIE | HDA_INTCTL_GIE); > + az->intctl = HDA_INTCTL_CIE | HDA_INTCTL_GIE; > + AZ_WRITE_4(az, INTCTL, az->intctl); > > return(0); > } > @@ -1421,7 +1408,6 @@ azalia_suspend(azalia_t *az) > > /* stop interrupts and clear status registers */ > AZ_WRITE_4(az, INTCTL, 0); > - AZ_WRITE_4(az, INTSTS, HDA_INTSTS_CIS | HDA_INTSTS_GIS); > AZ_WRITE_2(az, STATESTS, HDA_STATESTS_SDIWAKE); > AZ_WRITE_1(az, RIRBSTS, HDA_RIRBSTS_RINTFL | HDA_RIRBSTS_RIRBOIS); > > @@ -3723,7 +3709,6 @@ azalia_stream_start(stream_t *this) > bdlist_entry_t *bdlist; > bus_addr_t dmaaddr, dmaend; > int err, index; > - uint32_t intctl; > uint8_t ctl2; > > err = azalia_stream_reset(this); > @@ -3768,9 +3753,8 @@ azalia_stream_start(stream_t *this) > if (err) > return EINVAL; > > - intctl = AZ_READ_4(this->az, INTCTL); > - intctl |= this->intr_bit; > - AZ_WRITE_4(this->az, INTCTL, intctl); > + this->az->intctl |= this->intr_bit; > + AZ_WRITE_4(this->az, INTCTL, this->az->intctl); > > STR_WRITE_1(this, CTL, STR_READ_1(this, CTL) | > HDA_SD_CTL_DEIE | HDA_SD_CTL_FEIE | HDA_SD_CTL_IOCE | > @@ -3786,8 +3770,8 @@ azalia_stream_halt(stream_t *this) > ctl = STR_READ_2(this, CTL); > ctl &= ~(HDA_SD_CTL_DEIE | HDA_SD_CTL_FEIE | HDA_SD_CTL_IOCE | HDA_SD_CTL_RUN); > STR_WRITE_2(this, CTL, ctl); > - AZ_WRITE_4(this->az, INTCTL, > - AZ_READ_4(this->az, INTCTL) & ~this->intr_bit); > + this->az->intctl &= ~this->intr_bit; > + AZ_WRITE_4(this->az, INTCTL, this->az->intctl); > azalia_codec_disconnect_stream(this); > > return (0);