Download raw body.
smmu(4) on QC Laptops
On Sat, May 24, 2025 at 07:21:59AM +0200, Patrick Wildt wrote:
> On Fri, May 23, 2025 at 07:58:29PM +0200, Mark Kettenis wrote:
> > > Date: Fri, 23 May 2025 08:46:16 +0200
> > > From: Patrick Wildt <patrick@blueri.se>
> > > Cc: kettenis@openbsd.org, tobhe@openbsd.org, mlarkin@openbsd.org
> > > Content-Type: text/plain; charset=us-ascii
> > > Content-Disposition: inline
> > >
> > > Hi,
> > >
> > > Nearly three years ago we disabled smmu(4) for QC laptops because they
> > > just rebooted instantly. I've finally gone to the bottom of why this
> > > happens.
> > >
> > > On servers we usually boot up with most peripherals not doing any kind
> > > of DMA. A serial console is kind of the only thing we depend on, which
> > > usually doesn't use DMA for simplicity. Everything else, like NVMe or
> > > USB will get turned on later. For those use cases it's been perfectly
> > > fine to just turn the SMMU on with strict filtering, because only once
> > > the driver configures puts in a job DMA transactions will occur.
> > >
> > > On the QC laptops the framebuffer goes through an SMMU. That's one of
> > > the reasons we need to keep the streams alive and actively put them into
> > > a bypass context, which we do! As soon as a driver attaches, we move
> > > the stream over into a strict translation context.
> > >
> > > Unfortunately the changes I did were not completely sufficient.
> > >
> > > (1) The IOMMU enforcement happens before the driver gets a chance to
> > > stop current work.
> > >
> > > simplebus(4) creates an IOMMU-bound DMA tag before it attempts to attach
> > > the children. This leads do two interesting things:
> > >
> > > (a) If a simplebus(4) has an IOMMU assigned, it is passed an IOMMU-bound
> > > DMA tag. All children will get subjected to it.
> > >
> > > smmu1: establishing sid 0x423
> > > simplebus1 at simplebus0: "geniqup"
> > > "serial" at simplebus1 not configured
> > > smmu1: establishing sid 0x123
> > > simplebus2 at simplebus0: "geniqup"
> > >
> > > (b) If there's no driver to be found, the IOMMU-bound DMA tag is created
> > > anyway.
> > >
> > > smmu1: establishing sid 0x1c00
> > > smmu1: took over 8/1c00/2 for sid 1c00
> > > "display-subsystem" at simplebus0 not configured
> > >
> > > The good thing is that (a) is not a problem in this case as we do not
> > > have any active transactions such mappings, as far as I can see on the
> > > X1E. But the framebuffer is behind the display-subsystem which is now
> > > subjected to strict IOMMU configuration, making us need to fix (b).
> > >
> > > One idea I have is that we keep all the setup code where it is, at
> > > IOMMU-bound DMA tag creation time, but only move the stream to the
> > > strict context block once someone(TM) creates a DMA map using this
> > > tag.
> > >
> > > Is that a reasonable thing to do? I don't think anyone would be doing
> > > DMA without creating a DMA map first, right?
> >
> > Correct.
> >
> > I'm not sure how this is going to work if we want to write a driver
> > for the display controller. Presumably we'd create IOMMU mappings for
> > the initial framebuffer. But to do so, we'd need to create a DMA map
> > first, and when we do that... game over?
> >
> > So maybe we need to do the takeover when we do the first DMA map load,
> > after we've entered the mappings into the page table? But that only
> > works if we have a single mapping for the framebuffer.
>
> I was hoping that we'd have code that first turns off the transactions,
> making us "go blind" until we re-enable it with the new address. But
> yes, that would require the disable sequence to happen before the map
> is created. We could do it on load, which is kind of a hot-path. I
> was hoping for now we put it into map create and tackle this once or
> if we get around to writing this driver.
>
> > On the Apple systems, the framebuffer is also behind an IOMMU. There
> > we have some code that creates some initial mappings based on
> > information from the device tree.
> >
> > > (2) Re-use of SMRs doesn't take mask in account.
> > >
> > > In (1b) you can see that the display subsystem is using sid 0x1c00 with
> > > mask 0x2, matching to the following SMR:
> > >
> > > smmu1: SMR[8] = 0x1c00/0x2
> > >
> > > We'd like to re-use that if possible, but because of the strict mask
> > > checking that wasn't possible. So I'd like to take the mask into
> > > consideration when "taking over" streams. This is only done on QC!
> >
> > Sorry, you'll have to explain to me what what the SMRs do...
>
> Think work slightly different depending on SMMU version/feature flags,
> but for this machine it works an IOMMU setup like this:
>
> You configure a context block. The context block is basically the IOMMU
> group that owns the pagetable. To assign devices to a context block,
> you setup a "Stream Mapping Register", e.g.:
>
> * Allocate + setup context block 3 (e.g. IOMMU group 3)
> * This is the actual pagetable and translation setup
> * Allocate a stream map entry (can be any, e.g. 5)
> * The array has a fixed length, we need to use one slot to
> point device to context block.
> * Write(SMR(5), DeviceId + Mask + Valid)
> * For each DMA transaction, the SMMUs stream mapping function will
> iterate the stream map registers to find an enty where:
> StreamId & Mask == DeviceId & Mask
> * In this example, after writing SMR(5), slot 5 should match
> * Write(S2CR(5) = 3)
> * This points stream map 5 to context block 3
> * In this example, after the HW's lookup function returns 5, it will
> look into S2CR(5) to find which context block it belongs to
>
> On this laptop, the SMRs basically tell you what devices are currently
> enabled to pass through the IOMMU. We update the SMRs to point all
> currenetly allowed devices to a "bypass domain", so that existing
> traffic goes into a 1:1 translation.
>
> One we want attach a driver and want to make a device use stricter IOMMU
> mappings, we setup a context block and then switch the device from the
> bypass IOMMU group/context block to the strict IOMMU group/context
> block.
>
> In reality we're not actually changing the SMR (that's just the filter
> rule), we're updating the S2CR to point the stream to a different group.
>
> But if we don't "take over" the existing SMR mapping, the code will
> allocate a new SMR register, and then the IOMMU will complain about a
> match conflict because an incoming transaction matches two SMR entries.
>
> Let me know if I worded this too confusingly!
>
> > > Please give this a go on machines where `dmesg | grep ^smmu' gives
> > > output, and especially on QC machines with X1E, or SC8280XP/ThinkPad
> > > x13s. Please send me some dmesgs, as I have added some debug code
> > > that I'd like to see run on other machines. :)
> >
> > Here's the Ampere eMAG machine. ACPI so some of your changes don't
> > matter, but it still works with your diff.
> >
>
> Thanks! :)
>
> Cheers,
> Patrick
Updated diff that sets the SMR register not just to the stream id but
also the mask we kept saved.
diff --git a/sys/arch/arm64/conf/GENERIC b/sys/arch/arm64/conf/GENERIC
index adee39bb278..ac593d042ab 100644
--- a/sys/arch/arm64/conf/GENERIC
+++ b/sys/arch/arm64/conf/GENERIC
@@ -93,7 +93,7 @@ pci* at pciecam?
sdhc* at fdt?
sdmmc* at sdhc?
bwfm* at sdmmc? # Broadcom FullMAC
-smmu* at fdt?
+smmu* at fdt? early 1
xhci* at fdt?
ccp* at fdt? # AMD Cryptographic Co-processor
ipmi* at fdt?
diff --git a/sys/arch/arm64/dev/smmu.c b/sys/arch/arm64/dev/smmu.c
index edf5dd26de8..76ef21ae365 100644
--- a/sys/arch/arm64/dev/smmu.c
+++ b/sys/arch/arm64/dev/smmu.c
@@ -614,11 +614,11 @@ smmu_domain_create(struct smmu_softc *sc, uint32_t sid)
/* Take over QCOM SMRs */
if (sc->sc_is_qcom && sc->sc_smr[i] != NULL &&
sc->sc_smr[i]->ss_dom == NULL &&
- sc->sc_smr[i]->ss_id == sid &&
- sc->sc_smr[i]->ss_mask == 0) {
- free(sc->sc_smr[i], M_DEVBUF,
- sizeof(struct smmu_smr));
- sc->sc_smr[i] = NULL;
+ !((sc->sc_smr[i]->ss_id ^ sid) &
+ ~sc->sc_smr[i]->ss_mask)) {
+ sc->sc_smr[i]->ss_dom = dom;
+ dom->sd_smr_idx = i;
+ break;
}
if (sc->sc_smr[i] != NULL)
continue;
@@ -771,6 +771,28 @@ smmu_domain_create(struct smmu_softc *sc, uint32_t sid)
reg |= SMMU_CB_SCTLR_ASIDPNE;
smmu_cb_write_4(sc, dom->sd_cb_idx, SMMU_CB_SCTLR, reg);
+ snprintf(dom->sd_exname, sizeof(dom->sd_exname), "%s:%x",
+ sc->sc_dev.dv_xname, sid);
+ dom->sd_iovamap = extent_create(dom->sd_exname, 0,
+ (1LL << iovabits) - 1, M_DEVBUF, NULL, 0, EX_WAITOK |
+ EX_NOCOALESCE);
+
+ /* Reserve first page (to catch NULL access) */
+ extent_alloc_region(dom->sd_iovamap, 0, PAGE_SIZE, EX_WAITOK);
+
+ SIMPLEQ_INSERT_TAIL(&sc->sc_domains, dom, sd_list);
+ return dom;
+}
+
+static void
+smmu_domain_enable(struct smmu_domain *dom)
+{
+ struct smmu_softc *sc = dom->sd_sc;
+ uint32_t reg;
+
+ if (dom->sd_enabled)
+ return;
+
/* Point stream to context block */
reg = SMMU_S2CR_TYPE_TRANS | dom->sd_cb_idx;
if (sc->sc_has_exids && sc->sc_smr)
@@ -779,23 +801,14 @@ smmu_domain_create(struct smmu_softc *sc, uint32_t sid)
/* Map stream idx to S2CR idx */
if (sc->sc_smr) {
- reg = sid;
+ reg = sc->sc_smr[dom->sd_smr_idx]->ss_id << SMMU_SMR_ID_SHIFT |
+ sc->sc_smr[dom->sd_smr_idx]->ss_mask << SMMU_SMR_MASK_SHIFT;
if (!sc->sc_has_exids)
reg |= SMMU_SMR_VALID;
smmu_gr0_write_4(sc, SMMU_SMR(dom->sd_smr_idx), reg);
}
- snprintf(dom->sd_exname, sizeof(dom->sd_exname), "%s:%x",
- sc->sc_dev.dv_xname, sid);
- dom->sd_iovamap = extent_create(dom->sd_exname, 0,
- (1LL << iovabits) - 1, M_DEVBUF, NULL, 0, EX_WAITOK |
- EX_NOCOALESCE);
-
- /* Reserve first page (to catch NULL access) */
- extent_alloc_region(dom->sd_iovamap, 0, PAGE_SIZE, EX_WAITOK);
-
- SIMPLEQ_INSERT_TAIL(&sc->sc_domains, dom, sd_list);
- return dom;
+ dom->sd_enabled = 1;
}
void
@@ -1299,6 +1312,8 @@ smmu_dmamap_create(bus_dma_tag_t t, bus_size_t size, int nsegments,
u_long dva, len;
int error;
+ smmu_domain_enable(dom);
+
error = sc->sc_dmat->_dmamap_create(sc->sc_dmat, size,
nsegments, maxsegsz, boundary, flags, &map);
if (error)
diff --git a/sys/arch/arm64/dev/smmu_fdt.c b/sys/arch/arm64/dev/smmu_fdt.c
index 2c645da3f37..ac9ad9a9089 100644
--- a/sys/arch/arm64/dev/smmu_fdt.c
+++ b/sys/arch/arm64/dev/smmu_fdt.c
@@ -89,11 +89,6 @@ smmu_fdt_attach(struct device *parent, struct device *self, void *aux)
if (OF_getproplen(faa->fa_node, "dma-coherent") == 0)
sc->sc_coherent = 1;
- if (sc->sc_is_qcom) {
- printf(": disabled\n");
- return;
- }
-
if (smmu_attach(sc) != 0)
return;
diff --git a/sys/arch/arm64/dev/smmuvar.h b/sys/arch/arm64/dev/smmuvar.h
index c01e10518ee..03373167089 100644
--- a/sys/arch/arm64/dev/smmuvar.h
+++ b/sys/arch/arm64/dev/smmuvar.h
@@ -18,6 +18,7 @@
struct smmu_softc;
struct smmu_domain {
struct smmu_softc *sd_sc;
+ uint32_t sd_enabled;
uint32_t sd_sid;
bus_dma_tag_t sd_dmat;
int sd_cb_idx;
smmu(4) on QC Laptops