From: Martijn van Duren Subject: Re: azalia(4) testers wanted! To: Mark Kettenis Cc: tech@openbsd.org, alex@caoua.org, brynet@gmail.com, thfr@openbsd.org, obsd@bartula.de Date: Fri, 17 May 2024 09:58:30 +0200 On Thu, 2024-05-16 at 21:31 +0200, Mark Kettenis wrote: > > From: Martijn van Duren > > Date: Thu, 16 May 2024 17:19:41 +0200 > > Hi Folks, > > > > > On Sun, 2023-03-05 at 06:22 -0500, Bryan Steele wrote: > > > On Sun, Mar 05, 2023 at 08:53:00AM +0100, Alexandre Ratchov wrote: > > > > If you've an azalia(4) attaching as "AMD 17h/1xh HD Audio", please > > > > test this diff and report regressions. Especially audio lock ups that > > > > require reboot. > > > > > > > > IIRC, MSI was disabled few years ago to "fix" such lockups, and now > > > > this diff suggests we need MSI on certain boards. > > > > > > > > Context and diff below: > > > > > > > > > > > > > > No, this workaround is still needed. thfr@ and I tried to debug this years > > > ago but could not determine the cause at the time. > > > > > > This audio hang is still there on many systems, e.g: playback works for a > > > breif time until the it hangs, and only a reboot will fix it. But we could > > > never reproduce it with MSI disabled, so that was the best option we had. > > > > > > -Bryan. > > > > > This is a preliminary diff that passes my normal failures, but I haven't > > run for an elaborate amount of time. The problem is that the INTSTS can > > be set again by the controller without generating a new interrupt while > > we're still in the interrupt handler. Somehow I only found the Linux > > equivalent[0][1] when searching for my solution. The linux discussion > > also states that it was discovered on "some intel platforms", which > > matches my earlier statement about it also happening on my intel based > > laptop, but it mostly goes unnoticed there. So far I haven't properly > > tested this diff on my laptop since it occurs far and few in between. > > > > I don't know if this behaviour is timing-based, related to the > > interrupt mask, or maybe something else, so I do realise that this > > method of fixing things is a bit racy. However it is my understanding of > > what Linux does and does appear to fix my problems. The 10 retries is > > taken from Linux, but I haven't seen my card needing more than 1 extra > > iteration so far. > > > > I removed the MSI disable statement entirely so that people with other > > cards with the same symptoms can test it. > > > > I'll keep my computer running for some time to see if I can eventually > > hit the hang, but any feedback/tests is welcome. > > You're on to something. > > The main difference between legacy INTx interrupts and MSIs is that > the former are level-triggered, where as the latter are > edge-triggered. Level-triggered interrupts are a bit more forgiving > in the sense that if you don't handle all interrupts, the interrupt > will just fire again. But edge-triggered interrupts will only trigger > if the interrupt condition is properly cleared and a new interrupt > happens. > > At first sight the driver seems to do the right thing for > edge-triggered interrupts. The handler reads the INTSTS register and > then writes to it to clear the interrupt condition. It is reasonable > to expect that if further interrupts are triggered, we will get a new > MSI message. However, if you look at the spec: > > https://www.intel.com/content/www/us/en/standards/high-definition-audio-specification.html > > you'll see the INTSTS is a RO (read-only) registers; not a W1C > (write-one to clear) register. So on spec-compliant hardware, this > write doesn't do anything! Which means that any new interrupts coming > in while we're processing the initial ones will cause the interrupt to > remain asserted without a new MSI to happen. > > So instead the interrupt handler has to continue to process interrupts > as long as bits for the enabled interrupts are set. Only when none of > the bits are set can we return and be sure a new MSI to happen. > > Below is a new diff that does this and gets rid of all the bogus > INTSTS writes. > > This seems to work fine for me on Intel hardware. > Thanks, with this things make sense to me. I've been stressing my soundcard with multiple streams for about 12 hours now without any problems. OK martijn@ FWIW Going by the comment and your commit message from r1.196, do we still need the following section? /* disable MSI, use INTx instead */ if (PCI_VENDOR(sc->pciid) == PCI_VENDOR_INTEL) { reg = azalia_pci_read(sc->pc, sc->tag, ICH_PCI_MMC); reg &= ~(ICH_PCI_MMC_ME); azalia_pci_write(sc->pc, sc->tag, ICH_PCI_MMC, reg); } With and without this section my laptop attaches with msi and works just fine (only tested in combination with your diff): 0:31:3: Intel 500 Series HD Audio 0x0000: Vendor ID: 8086, Product ID: a0c8 0x0004: Command: 0006, Status: 0010 0x0008: Class: 04 Multimedia, Subclass: 03 HD Audio, Interface: 80, Revision: 20 0x000c: BIST: 00, Header Type: 00, Latency Timer: 00, Cache Line Size: 00 0x0010: BAR mem 64bit addr: 0x0000006001150000/0x00004000 0x0018: BAR empty (00000000) 0x001c: BAR empty (00000000) 0x0020: BAR mem 64bit addr: 0x0000006001000000/0x00100000 0x0028: Cardbus CIS: 00000000 0x002c: Subsystem Vendor ID: 17aa Product ID: 22d1 0x0030: Expansion ROM Base Address: 00000000 0x0038: 00000000 0x003c: Interrupt Pin: 01 Line: ff Min Gnt: 00 Max Lat: 00 0x0050: Capability 0x01: Power Management State: D0 0x0080: Capability 0x09: Vendor Specific 0x0060: Capability 0x05: Message Signalled Interrupts (MSI) Enabled: yes; 1 vectors (1 enabled) dmesg | grep -A1 -e ^azalia -e OpenBSD OpenBSD 7.5-current (GENERIC.MP) #74: Thu May 16 21:23:12 MDT 2024 deraadt@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP -- azalia0 at pci0 dev 31 function 3 "Intel 500 Series HD Audio" rev 0x20: msi azalia0: codecs: Realtek ALC257, Intel/0x2812, using Realtek ALC257 audio0 at azalia0 -- OpenBSD 7.5-current (GENERIC.MP) #18: Fri May 17 09:37:21 CEST 2024 martijn@martijn:/sys/arch/amd64/compile/GENERIC.MP -- azalia0 at pci0 dev 31 function 3 "Intel 500 Series HD Audio" rev 0x20: msi azalia0: codecs: Realtek ALC257, Intel/0x2812, using Realtek ALC257 audio0 at azalia0