From: Marcus Glocker Subject: Re: smmu(4) on QC Laptops To: Patrick Wildt Cc: Mark Kettenis , tech@openbsd.org, tobhe@openbsd.org, mlarkin@openbsd.org Date: Sat, 2 Aug 2025 16:24:01 +0200 On Fri, Aug 01, 2025 at 10:21:21PM +0200, Marcus Glocker wrote: > On Fri, Aug 01, 2025 at 09:21:37PM +0200, Patrick Wildt wrote: > > > On Sat, May 24, 2025 at 07:21:59AM +0200, Patrick Wildt wrote: > > > On Fri, May 23, 2025 at 07:58:29PM +0200, Mark Kettenis wrote: > > > > > Date: Fri, 23 May 2025 08:46:16 +0200 > > > > > From: Patrick Wildt > > > > > Cc: kettenis@openbsd.org, tobhe@openbsd.org, mlarkin@openbsd.org > > > > > Content-Type: text/plain; charset=us-ascii > > > > > Content-Disposition: inline > > > > > > > > > > Hi, > > > > > > > > > > Nearly three years ago we disabled smmu(4) for QC laptops because they > > > > > just rebooted instantly. I've finally gone to the bottom of why this > > > > > happens. > > > > > > > > > > On servers we usually boot up with most peripherals not doing any kind > > > > > of DMA. A serial console is kind of the only thing we depend on, which > > > > > usually doesn't use DMA for simplicity. Everything else, like NVMe or > > > > > USB will get turned on later. For those use cases it's been perfectly > > > > > fine to just turn the SMMU on with strict filtering, because only once > > > > > the driver configures puts in a job DMA transactions will occur. > > > > > > > > > > On the QC laptops the framebuffer goes through an SMMU. That's one of > > > > > the reasons we need to keep the streams alive and actively put them into > > > > > a bypass context, which we do! As soon as a driver attaches, we move > > > > > the stream over into a strict translation context. > > > > > > > > > > Unfortunately the changes I did were not completely sufficient. > > > > > > > > > > (1) The IOMMU enforcement happens before the driver gets a chance to > > > > > stop current work. > > > > > > > > > > simplebus(4) creates an IOMMU-bound DMA tag before it attempts to attach > > > > > the children. This leads do two interesting things: > > > > > > > > > > (a) If a simplebus(4) has an IOMMU assigned, it is passed an IOMMU-bound > > > > > DMA tag. All children will get subjected to it. > > > > > > > > > > smmu1: establishing sid 0x423 > > > > > simplebus1 at simplebus0: "geniqup" > > > > > "serial" at simplebus1 not configured > > > > > smmu1: establishing sid 0x123 > > > > > simplebus2 at simplebus0: "geniqup" > > > > > > > > > > (b) If there's no driver to be found, the IOMMU-bound DMA tag is created > > > > > anyway. > > > > > > > > > > smmu1: establishing sid 0x1c00 > > > > > smmu1: took over 8/1c00/2 for sid 1c00 > > > > > "display-subsystem" at simplebus0 not configured > > > > > > > > > > The good thing is that (a) is not a problem in this case as we do not > > > > > have any active transactions such mappings, as far as I can see on the > > > > > X1E. But the framebuffer is behind the display-subsystem which is now > > > > > subjected to strict IOMMU configuration, making us need to fix (b). > > > > > > > > > > One idea I have is that we keep all the setup code where it is, at > > > > > IOMMU-bound DMA tag creation time, but only move the stream to the > > > > > strict context block once someone(TM) creates a DMA map using this > > > > > tag. > > > > > > > > > > Is that a reasonable thing to do? I don't think anyone would be doing > > > > > DMA without creating a DMA map first, right? > > > > > > > > Correct. > > > > > > > > I'm not sure how this is going to work if we want to write a driver > > > > for the display controller. Presumably we'd create IOMMU mappings for > > > > the initial framebuffer. But to do so, we'd need to create a DMA map > > > > first, and when we do that... game over? > > > > > > > > So maybe we need to do the takeover when we do the first DMA map load, > > > > after we've entered the mappings into the page table? But that only > > > > works if we have a single mapping for the framebuffer. > > > > > > I was hoping that we'd have code that first turns off the transactions, > > > making us "go blind" until we re-enable it with the new address. But > > > yes, that would require the disable sequence to happen before the map > > > is created. We could do it on load, which is kind of a hot-path. I > > > was hoping for now we put it into map create and tackle this once or > > > if we get around to writing this driver. > > > > > > > On the Apple systems, the framebuffer is also behind an IOMMU. There > > > > we have some code that creates some initial mappings based on > > > > information from the device tree. > > > > > > > > > (2) Re-use of SMRs doesn't take mask in account. > > > > > > > > > > In (1b) you can see that the display subsystem is using sid 0x1c00 with > > > > > mask 0x2, matching to the following SMR: > > > > > > > > > > smmu1: SMR[8] = 0x1c00/0x2 > > > > > > > > > > We'd like to re-use that if possible, but because of the strict mask > > > > > checking that wasn't possible. So I'd like to take the mask into > > > > > consideration when "taking over" streams. This is only done on QC! > > > > > > > > Sorry, you'll have to explain to me what what the SMRs do... > > > > > > Think work slightly different depending on SMMU version/feature flags, > > > but for this machine it works an IOMMU setup like this: > > > > > > You configure a context block. The context block is basically the IOMMU > > > group that owns the pagetable. To assign devices to a context block, > > > you setup a "Stream Mapping Register", e.g.: > > > > > > * Allocate + setup context block 3 (e.g. IOMMU group 3) > > > * This is the actual pagetable and translation setup > > > * Allocate a stream map entry (can be any, e.g. 5) > > > * The array has a fixed length, we need to use one slot to > > > point device to context block. > > > * Write(SMR(5), DeviceId + Mask + Valid) > > > * For each DMA transaction, the SMMUs stream mapping function will > > > iterate the stream map registers to find an enty where: > > > StreamId & Mask == DeviceId & Mask > > > * In this example, after writing SMR(5), slot 5 should match > > > * Write(S2CR(5) = 3) > > > * This points stream map 5 to context block 3 > > > * In this example, after the HW's lookup function returns 5, it will > > > look into S2CR(5) to find which context block it belongs to > > > > > > On this laptop, the SMRs basically tell you what devices are currently > > > enabled to pass through the IOMMU. We update the SMRs to point all > > > currenetly allowed devices to a "bypass domain", so that existing > > > traffic goes into a 1:1 translation. > > > > > > One we want attach a driver and want to make a device use stricter IOMMU > > > mappings, we setup a context block and then switch the device from the > > > bypass IOMMU group/context block to the strict IOMMU group/context > > > block. > > > > > > In reality we're not actually changing the SMR (that's just the filter > > > rule), we're updating the S2CR to point the stream to a different group. > > > > > > But if we don't "take over" the existing SMR mapping, the code will > > > allocate a new SMR register, and then the IOMMU will complain about a > > > match conflict because an incoming transaction matches two SMR entries. > > > > > > Let me know if I worded this too confusingly! > > > > > > > > Please give this a go on machines where `dmesg | grep ^smmu' gives > > > > > output, and especially on QC machines with X1E, or SC8280XP/ThinkPad > > > > > x13s. Please send me some dmesgs, as I have added some debug code > > > > > that I'd like to see run on other machines. :) > > > > > > > > Here's the Ampere eMAG machine. ACPI so some of your changes don't > > > > matter, but it still works with your diff. > > > > > > > > > > Thanks! :) > > > > > > Cheers, > > > Patrick > > > > Updated diff that sets the SMR register not just to the stream id but > > also the mask we kept saved. > > Following the fdt dmesgs without, and with your patch, from my > Samsung Galaxy Book4 Edge. I'm not sure what has changed in the meantime, but when I retry your patch with the latest kernel sources, I'm getting a panic now during boot triggered by the ufshci(4) DMA allocation: https://nazgul.ch/pub/smmu_panic.jpg I haven't investigated further yet, just noticed it now.