smmu(4) on QC Laptops
> Date: Fri, 23 May 2025 08:46:16 +0200
> From: Patrick Wildt <patrick@blueri.se>
> Cc: kettenis@openbsd.org, tobhe@openbsd.org, mlarkin@openbsd.org
> Content-Type: text/plain; charset=us-ascii
> Content-Disposition: inline
>
> Hi,
>
> Nearly three years ago we disabled smmu(4) for QC laptops because they
> just rebooted instantly. I've finally gotten to the bottom of why this
> happens.
>
> On servers we usually boot up with most peripherals not doing any kind
> of DMA. A serial console is kind of the only thing we depend on, which
> usually doesn't use DMA for simplicity. Everything else, like NVMe or
> USB will get turned on later. For those use cases it's been perfectly
> fine to just turn the SMMU on with strict filtering, because DMA
> transactions only occur once the driver configures the device and puts
> in a job.
>
> On the QC laptops the framebuffer goes through an SMMU. That's one of
> the reasons we need to keep the streams alive and actively put them into
> a bypass context, which we do! As soon as a driver attaches, we move
> the stream over into a strict translation context.
>
> Unfortunately the changes I did were not completely sufficient.
>
> (1) The IOMMU enforcement happens before the driver gets a chance to
> stop current work.
>
> simplebus(4) creates an IOMMU-bound DMA tag before it attempts to attach
> the children. This leads to two interesting things:
>
> (a) If a simplebus(4) has an IOMMU assigned, it is passed an IOMMU-bound
> DMA tag. All children will get subjected to it.
>
> smmu1: establishing sid 0x423
> simplebus1 at simplebus0: "geniqup"
> "serial" at simplebus1 not configured
> smmu1: establishing sid 0x123
> simplebus2 at simplebus0: "geniqup"
>
> (b) If there's no driver to be found, the IOMMU-bound DMA tag is created
> anyway.
>
> smmu1: establishing sid 0x1c00
> smmu1: took over 8/1c00/2 for sid 1c00
> "display-subsystem" at simplebus0 not configured
>
> The good thing is that (a) is not a problem in this case as we do not
> have any active transactions on such mappings, as far as I can see on the
> X1E. But the framebuffer is behind the display-subsystem which is now
> subjected to strict IOMMU configuration, making us need to fix (b).
>
> One idea I have is that we keep all the setup code where it is, at
> IOMMU-bound DMA tag creation time, but only move the stream to the
> strict context block once someone(TM) creates a DMA map using this
> tag.
>
> Is that a reasonable thing to do? I don't think anyone would be doing
> DMA without creating a DMA map first, right?
Correct.

I'm not sure how this is going to work if we want to write a driver
for the display controller. Presumably we'd create IOMMU mappings for
the initial framebuffer. But to do so, we'd need to create a DMA map
first, and when we do that... game over?

So maybe we need to do the takeover when we do the first DMA map load,
after we've entered the mappings into the page table? But that only
works if we have a single mapping for the framebuffer.

On the Apple systems, the framebuffer is also behind an IOMMU. There
we have some code that creates some initial mappings based on
information from the device tree.
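
To make the ordering concrete, here is a rough userland model of the
"take over on first map load" variant. It is only a sketch, not code
from smmu(4): the struct, the enum and all function names are
hypothetical, and the "page table" is reduced to a counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model; none of these names exist in the smmu(4) driver. */
enum stream_ctx { CTX_BYPASS, CTX_STRICT };

struct stream {
	unsigned int	sid;		/* stream ID */
	enum stream_ctx	ctx;		/* context the stream is routed to */
	size_t		npages;		/* pages entered into the page table */
};

/* Tag creation does all the setup but leaves the stream in bypass. */
static void
stream_tag_create(struct stream *s, unsigned int sid)
{
	assert(s != NULL);
	s->sid = sid;
	s->ctx = CTX_BYPASS;
	s->npages = 0;
}

/* Map creation only allocates bookkeeping, so it must not take over. */
static void
stream_map_create(struct stream *s)
{
	assert(s != NULL);
}

/*
 * Map load enters the mappings first and only then moves the stream
 * to the strict context, so ongoing DMA finds valid translations.
 */
static void
stream_map_load(struct stream *s, size_t npages)
{
	assert(s != NULL);
	s->npages += npages;
	if (s->ctx == CTX_BYPASS)
		s->ctx = CTX_STRICT;
}
```

In this model a stream stays in CTX_BYPASS after stream_tag_create()
and stream_map_create(), and only the first stream_map_load() flips it
to CTX_STRICT. It also shows the limitation mentioned above: anything
the device touches outside that first mapping faults after the switch.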
> (2) Re-use of SMRs doesn't take the mask into account.
>
> In (1b) you can see that the display subsystem is using sid 0x1c00 with
> mask 0x2, matching to the following SMR:
>
> smmu1: SMR[8] = 0x1c00/0x2
>
> We'd like to re-use that if possible, but because of the strict mask
> checking that wasn't possible. So I'd like to take the mask into
> consideration when "taking over" streams. This is only done on QC!
Sorry, you'll have to explain to me what the SMRs do...
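
For reference, my understanding of stream matching on an SMMUv2 is
that each Stream Match Register holds an ID and a MASK, and bits set
in MASK are ignored when comparing an incoming stream ID against ID.
The sketch below models that. The struct and function names are made
up, and smr_covers() is only a guess at what "taking the mask into
consideration" could mean for a takeover, not what the diff does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one Stream Match Register. */
struct smr {
	uint16_t	id;	/* stream ID to match */
	uint16_t	mask;	/* set bits are "don't care" */
	bool		valid;
};

/* Does an incoming stream ID hit this SMR? */
static bool
smr_matches(const struct smr *r, uint16_t sid)
{
	assert(r != NULL);
	return r->valid && ((sid ^ r->id) & ~r->mask) == 0;
}

/*
 * Assumption: an existing SMR can be taken over for a request
 * (id, mask) if every stream ID the request covers also hits the SMR.
 */
static bool
smr_covers(const struct smr *r, uint16_t id, uint16_t mask)
{
	assert(r != NULL);
	return r->valid && (mask & ~r->mask) == 0 &&
	    ((id ^ r->id) & ~r->mask) == 0;
}
```

With SMR[8] = 0x1c00/0x2 from the output above, smr_matches() accepts
stream IDs 0x1c00 and 0x1c02 and rejects 0x1c01. An exact comparison
of ID and MASK would refuse a request for 0x1c00 with mask 0x0, while
smr_covers() lets it re-use the entry.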
> Please give this a go on machines where `dmesg | grep ^smmu' gives
> output, and especially on QC machines with X1E, or SC8280XP/ThinkPad
> x13s. Please send me some dmesgs, as I have added some debug code
> that I'd like to see run on other machines. :)
Here's the Ampere eMAG machine. ACPI so some of your changes don't
matter, but it still works with your diff.
OpenBSD 7.7-current (GENERIC.MP) #1: Fri May 23 19:28:38 CEST 2025
kettenis@liszt.sibelius.xs4all.nl:/home/kettenis/src/smmu/sys/arch/arm64/compile/GENERIC.MP
real mem = 137121468416 (130769MB)
avail mem = 132743782400 (126594MB)
random: good seed from bootblocks
mainbus0 at root: ACPI
psci0 at mainbus0: PSCI 1.1, SMCCC 1.0
efi0 at mainbus0: UEFI 2.7
efi0: American Megatrends rev 0x5000d
smbios0 at efi0: SMBIOS 3.2.0
smbios0: vendor LENOVO version "hve104r-1.15" date 02/26/2021
smbios0: Lenovo HR330A 7X33CTO1WW
cpu0 at mainbus0 mpidr 0: Applied Micro X-Gene r3p2
cpu0: 32KB 64b/line 8-way L1 PIPT I-cache, 32KB 64b/line 8-way L1 D-cache
cpu0: 256KB 64b/line 32-way L2 cache
cpu0: CRC32,SHA2,SHA1,AES+PMULL,ASID16
cpu1 at mainbus0 mpidr 1: Applied Micro X-Gene r3p2
cpu1: 32KB 64b/line 8-way L1 PIPT I-cache, 32KB 64b/line 8-way L1 D-cache
cpu1: 256KB 64b/line 32-way L2 cache
cpu2 at mainbus0 mpidr 100: Applied Micro X-Gene r3p2
cpu2: 32KB 64b/line 8-way L1 PIPT I-cache, 32KB 64b/line 8-way L1 D-cache
cpu2: 256KB 64b/line 32-way L2 cache
cpu3 at mainbus0 mpidr 101: Applied Micro X-Gene r3p2
cpu3: 32KB 64b/line 8-way L1 PIPT I-cache, 32KB 64b/line 8-way L1 D-cache
cpu3: 256KB 64b/line 32-way L2 cache
cpu4 at mainbus0 mpidr 200: Applied Micro X-Gene r3p2
cpu4: 32KB 64b/line 8-way L1 PIPT I-cache, 32KB 64b/line 8-way L1 D-cache
cpu4: 256KB 64b/line 32-way L2 cache
cpu5 at mainbus0 mpidr 201: Applied Micro X-Gene r3p2
cpu5: 32KB 64b/line 8-way L1 PIPT I-cache, 32KB 64b/line 8-way L1 D-cache
cpu5: 256KB 64b/line 32-way L2 cache
cpu6 at mainbus0 mpidr 300: Applied Micro X-Gene r3p2
cpu6: 32KB 64b/line 8-way L1 PIPT I-cache, 32KB 64b/line 8-way L1 D-cache
cpu6: 256KB 64b/line 32-way L2 cache
cpu7 at mainbus0 mpidr 301: Applied Micro X-Gene r3p2
cpu7: 32KB 64b/line 8-way L1 PIPT I-cache, 32KB 64b/line 8-way L1 D-cache
cpu7: 256KB 64b/line 32-way L2 cache
cpu8 at mainbus0 mpidr 400: Applied Micro X-Gene r3p2
cpu8: 32KB 64b/line 8-way L1 PIPT I-cache, 32KB 64b/line 8-way L1 D-cache
cpu8: 256KB 64b/line 32-way L2 cache
cpu9 at mainbus0 mpidr 401: Applied Micro X-Gene r3p2
cpu9: 32KB 64b/line 8-way L1 PIPT I-cache, 32KB 64b/line 8-way L1 D-cache
cpu9: 256KB 64b/line 32-way L2 cache
cpu10 at mainbus0 mpidr 500: Applied Micro X-Gene r3p2
cpu10: 32KB 64b/line 8-way L1 PIPT I-cache, 32KB 64b/line 8-way L1 D-cache
cpu10: 256KB 64b/line 32-way L2 cache
cpu11 at mainbus0 mpidr 501: Applied Micro X-Gene r3p2
cpu11: 32KB 64b/line 8-way L1 PIPT I-cache, 32KB 64b/line 8-way L1 D-cache
cpu11: 256KB 64b/line 32-way L2 cache
cpu12 at mainbus0 mpidr 600: Applied Micro X-Gene r3p2
cpu12: 32KB 64b/line 8-way L1 PIPT I-cache, 32KB 64b/line 8-way L1 D-cache
cpu12: 256KB 64b/line 32-way L2 cache
cpu13 at mainbus0 mpidr 601: Applied Micro X-Gene r3p2
cpu13: 32KB 64b/line 8-way L1 PIPT I-cache, 32KB 64b/line 8-way L1 D-cache
cpu13: 256KB 64b/line 32-way L2 cache
cpu14 at mainbus0 mpidr 700: Applied Micro X-Gene r3p2
cpu14: 32KB 64b/line 8-way L1 PIPT I-cache, 32KB 64b/line 8-way L1 D-cache
cpu14: 256KB 64b/line 32-way L2 cache
cpu15 at mainbus0 mpidr 701: Applied Micro X-Gene r3p2
cpu15: 32KB 64b/line 8-way L1 PIPT I-cache, 32KB 64b/line 8-way L1 D-cache
cpu15: 256KB 64b/line 32-way L2 cache
cpu16 at mainbus0 mpidr 800: Applied Micro X-Gene r3p2
cpu16: 32KB 64b/line 8-way L1 PIPT I-cache, 32KB 64b/line 8-way L1 D-cache
cpu16: 256KB 64b/line 32-way L2 cache
cpu17 at mainbus0 mpidr 801: Applied Micro X-Gene r3p2
cpu17: 32KB 64b/line 8-way L1 PIPT I-cache, 32KB 64b/line 8-way L1 D-cache
cpu17: 256KB 64b/line 32-way L2 cache
cpu18 at mainbus0 mpidr 900: Applied Micro X-Gene r3p2
cpu18: 32KB 64b/line 8-way L1 PIPT I-cache, 32KB 64b/line 8-way L1 D-cache
cpu18: 256KB 64b/line 32-way L2 cache
cpu19 at mainbus0 mpidr 901: Applied Micro X-Gene r3p2
cpu19: 32KB 64b/line 8-way L1 PIPT I-cache, 32KB 64b/line 8-way L1 D-cache
cpu19: 256KB 64b/line 32-way L2 cache
cpu20 at mainbus0 mpidr a00: Applied Micro X-Gene r3p2
cpu20: 32KB 64b/line 8-way L1 PIPT I-cache, 32KB 64b/line 8-way L1 D-cache
cpu20: 256KB 64b/line 32-way L2 cache
cpu21 at mainbus0 mpidr a01: Applied Micro X-Gene r3p2
cpu21: 32KB 64b/line 8-way L1 PIPT I-cache, 32KB 64b/line 8-way L1 D-cache
cpu21: 256KB 64b/line 32-way L2 cache
cpu22 at mainbus0 mpidr b00: Applied Micro X-Gene r3p2
cpu22: 32KB 64b/line 8-way L1 PIPT I-cache, 32KB 64b/line 8-way L1 D-cache
cpu22: 256KB 64b/line 32-way L2 cache
cpu23 at mainbus0 mpidr b01: Applied Micro X-Gene r3p2
cpu23: 32KB 64b/line 8-way L1 PIPT I-cache, 32KB 64b/line 8-way L1 D-cache
cpu23: 256KB 64b/line 32-way L2 cache
cpu24 at mainbus0 mpidr c00: Applied Micro X-Gene r3p2
cpu24: 32KB 64b/line 8-way L1 PIPT I-cache, 32KB 64b/line 8-way L1 D-cache
cpu24: 256KB 64b/line 32-way L2 cache
cpu25 at mainbus0 mpidr c01: Applied Micro X-Gene r3p2
cpu25: 32KB 64b/line 8-way L1 PIPT I-cache, 32KB 64b/line 8-way L1 D-cache
cpu25: 256KB 64b/line 32-way L2 cache
cpu26 at mainbus0 mpidr d00: Applied Micro X-Gene r3p2
cpu26: 32KB 64b/line 8-way L1 PIPT I-cache, 32KB 64b/line 8-way L1 D-cache
cpu26: 256KB 64b/line 32-way L2 cache
cpu27 at mainbus0 mpidr d01: Applied Micro X-Gene r3p2
cpu27: 32KB 64b/line 8-way L1 PIPT I-cache, 32KB 64b/line 8-way L1 D-cache
cpu27: 256KB 64b/line 32-way L2 cache
cpu28 at mainbus0 mpidr e00: Applied Micro X-Gene r3p2
cpu28: 32KB 64b/line 8-way L1 PIPT I-cache, 32KB 64b/line 8-way L1 D-cache
cpu28: 256KB 64b/line 32-way L2 cache
cpu29 at mainbus0 mpidr e01: Applied Micro X-Gene r3p2
cpu29: 32KB 64b/line 8-way L1 PIPT I-cache, 32KB 64b/line 8-way L1 D-cache
cpu29: 256KB 64b/line 32-way L2 cache
cpu30 at mainbus0 mpidr f00: Applied Micro X-Gene r3p2
cpu30: 32KB 64b/line 8-way L1 PIPT I-cache, 32KB 64b/line 8-way L1 D-cache
cpu30: 256KB 64b/line 32-way L2 cache
cpu31 at mainbus0 mpidr f01: Applied Micro X-Gene r3p2
cpu31: 32KB 64b/line 8-way L1 PIPT I-cache, 32KB 64b/line 8-way L1 D-cache
cpu31: 256KB 64b/line 32-way L2 cache
apm0 at mainbus0
agintc0 at mainbus0 shift 4:4 nirq 544 nredist 32 ipi: 0, 1, 2: "interrupt-controller"
agintcmsi0 at agintc0
agtimer0 at mainbus0: 40000 kHz
acpi0 at mainbus0: ACPI 6.1
acpi0: sleep states
acpi0: tables DSDT FACP FIDT DBG2 GTDT IORT MCFG SSDT SPMI APIC PCCT BERT HEST VFCT SPCR PPTT
acpi0: wakeup devices
acpiiort0 at acpi0
smmu0 at acpiiort0 addr 0x14000000/0x100000: 128 CBs (128 S2-only)
smmu1 at acpiiort0 addr 0x15000000/0x100000: 128 CBs (128 S2-only)
acpimcfg0 at acpi0
acpimcfg0: addr 0x10000000000, bus 0-31
acpimcfg0: addr 0x7800000000, bus 0-31
acpimcfg0: addr 0x1000000000, bus 0-31
acpimcfg0: addr 0x5800000000, bus 0-31
acpimcfg0: addr 0x6000000000, bus 0-31
acpimcfg0: addr 0x7000000000, bus 0-7
acpimcfg0: addr 0x600000000, bus 0-7
acpimcfg0: addr 0x400000000, bus 0-7
"ACPI0010" at acpi0 not configured
"ACPI0010" at acpi0 not configured
"ACPI0010" at acpi0 not configured
"ACPI0010" at acpi0 not configured
"ACPI0010" at acpi0 not configured
"ACPI0010" at acpi0 not configured
"ACPI0010" at acpi0 not configured
"ACPI0010" at acpi0 not configured
"ACPI0010" at acpi0 not configured
"ACPI0010" at acpi0 not configured
"ACPI0010" at acpi0 not configured
"ACPI0010" at acpi0 not configured
"ACPI0010" at acpi0 not configured
"ACPI0010" at acpi0 not configured
"ACPI0010" at acpi0 not configured
"ACPI0010" at acpi0 not configured
"APMC0D40" at acpi0 not configured
"APMC0D40" at acpi0 not configured
dwiic0 at acpi0 I2C4 addr 0x126b0000/0x1000 irq 105
iic0 at dwiic0
ipmi0 at iic0 addr 0x10: version 2.0 interface SSIF
pluart0 at acpi0 URT0 addr 0x12600000/0x1000 irq 98
pluart0: console
pluart1 at acpi0 URT1 addr 0x12610000/0x1000 irq 99
smmu1: establishing sid 0x0
ahci0 at acpi0 SAT0 addr 0x1c000000/0x1000 irq 111: AHCI 1.3.1
ahci0: port 0: 6.0Gb/s
scsibus0 at ahci0: 32 targets
sd0 at scsibus0 targ 0 lun 0: <ATA, Samsung SSD 860, RVT0> naa.5002538e404a2eee
sd0: 238475MB, 512 bytes/sector, 488397168 sectors, thin
smmu1: establishing sid 0x800
ahci1 at acpi0 SAT1 addr 0x1c100000/0x1000 irq 112: AHCI 1.3.1
scsibus1 at ahci1: 32 targets
smmu1: establishing sid 0x1000
xhci0 at acpi0 USB0 addr 0x13800000/0x100000 irq 115, xHCI 1.10
usb0 at xhci0: USB revision 3.0
uhub0 at usb0 configuration 1 interface 0 "Generic xHCI root hub" rev 3.00/1.00 addr 1
smmu1: establishing sid 0x1800
xhci1 at acpi0 USB1 addr 0x13900000/0x100000 irq 116, xHCI 1.10
usb1 at xhci1: USB revision 3.0
uhub1 at usb1 configuration 1 interface 0 "Generic xHCI root hub" rev 3.00/1.00 addr 1
acpibtn0 at acpi0: PWRB
acpige0 at acpi0 irq 84
acpige1 at acpi0 irq 72
"PNP0C33" at acpi0 not configured
"LNRO0007" at acpi0 not configured
"APMC0D83" at acpi0 not configured
"APMC0D84" at acpi0 not configured
"APMC0D84" at acpi0 not configured
"APMC0D84" at acpi0 not configured
"APMC0D84" at acpi0 not configured
"APMC0D84" at acpi0 not configured
"APMC0D84" at acpi0 not configured
"APMC0D84" at acpi0 not configured
"APMC0D84" at acpi0 not configured
"APMC0D87" at acpi0 not configured
"APMC0D87" at acpi0 not configured
"APMC0D88" at acpi0 not configured
"APMC0D88" at acpi0 not configured
"APMC0D88" at acpi0 not configured
"APMC0D88" at acpi0 not configured
"APMC0D88" at acpi0 not configured
"APMC0D88" at acpi0 not configured
"APMC0D88" at acpi0 not configured
"APMC0D88" at acpi0 not configured
"APMC0D85" at acpi0 not configured
"APMC0D86" at acpi0 not configured
acpipci0 at acpi0 PCI0
pci0 at acpipci0
smmu0: establishing sid 0x0
ppb0 at pci0 dev 0 function 0 "Ampere eMAG PCIe" rev 0x04: irq 131
pci1 at ppb0 bus 1
smmu0: establishing sid 0x100
em0 at pci1 dev 0 function 0 "Intel I210" rev 0x03: msi, address 00:1b:21:e0:6f:7d
acpipci1 at acpi0 PCI2
pci2 at acpipci1
smmu1: establishing sid 0x4000
0:0:0: bridge io address conflict 0x10000000/0x1000
ppb1 at pci2 dev 0 function 0 "Ampere eMAG PCIe" rev 0x04: irq 143
pci3 at ppb1 bus 1
smmu1: establishing sid 0x4100
smmu1: establishing sid 0x4101
amdgpu0 at pci3 dev 0 function 0 "ATI Polaris 12" rev 0x00
drm0 at amdgpu0
amdgpu0: msi
azalia0 at pci3 dev 0 function 1 "ATI Radeon Pro Audio" rev 0x00: msi
azalia0: no supported codecs
acpipci2 at acpi0 PCI3
pci4 at acpipci2
smmu1: establishing sid 0x6000
ppb2 at pci4 dev 0 function 0 "Ampere eMAG PCIe" rev 0x04: irq 149
pci5 at ppb2 bus 1
acpipci3 at acpi0 PCI4
pci6 at acpipci3
smmu0: establishing sid 0x4000
ppb3 at pci6 dev 0 function 0 "Ampere eMAG PCIe" rev 0x04: irq 155
pci7 at ppb3 bus 1
acpipci4 at acpi0 PCI5
pci8 at acpipci4
smmu0: establishing sid 0x6000
ppb4 at pci8 dev 0 function 0 "Ampere eMAG PCIe" rev 0x04: irq 161
pci9 at ppb4 bus 1
acpipci5 at acpi0 PCI6
pci10 at acpipci5
smmu1: establishing sid 0x2000
ppb5 at pci10 dev 0 function 0 "Ampere eMAG PCIe" rev 0x04: irq 167
pci11 at ppb5 bus 1
acpipci6 at acpi0 PCI7
pci12 at acpipci6
smmu1: establishing sid 0x2800
0:0:0: bridge io address conflict 0x10000000/0x1000
ppb6 at pci12 dev 0 function 0 "Ampere eMAG PCIe" rev 0x04: irq 173
pci13 at ppb6 bus 1
smmu1: establishing sid 0x2900
1:0:0: bridge io address conflict 0x10000000/0x1000
ppb7 at pci13 dev 0 function 0 "ASPEED Technology AST1150 PCI" rev 0x04
pci14 at ppb7 bus 2
smmu1: establishing sid 0x2a00
"ASPEED Technology AST2000" rev 0x41 at pci14 dev 0 function 0 not configured
uhub2 at uhub0 port 1 configuration 1 interface 0 "American Megatrends Inc. Virtual Hub" rev 2.00/1.00 addr 2
umass0 at uhub2 port 1 configuration 1 interface 0 "American Megatrends Inc. Virtual Cdrom Device" rev 2.00/1.00 addr 3
umass0: using SCSI over Bulk-Only
scsibus2 at umass0: 2 targets, initiator 0
cd0 at scsibus2 targ 1 lun 0: <AMI, Virtual CDROM0, 1.00> removable serial.046bff20AAABBBBCCCC1
umass1 at uhub2 port 2 configuration 1 interface 0 "American Megatrends Inc. Virtual HardDisk Device" rev 2.00/1.00 addr 4
umass1: using SCSI over Bulk-Only
scsibus3 at umass1: 2 targets, initiator 0
sd1 at scsibus3 targ 1 lun 0: <AMI, Virtual HDisk0, 1.00> removable serial.046bff31AAABBBBCCCC3
uhidev0 at uhub2 port 3 configuration 1 interface 0 "American Megatrends Inc. Virtual Keyboard and Mouse" rev 1.10/1.00 addr 5
uhidev0: iclass 3/1
ukbd0 at uhidev0: 8 variable keys, 6 key codes
wskbd0 at ukbd0 mux 1
uhidev1 at uhub2 port 3 configuration 1 interface 1 "American Megatrends Inc. Virtual Keyboard and Mouse" rev 1.10/1.00 addr 5
uhidev1: iclass 3/1
ums0 at uhidev1: 3 buttons, Z dir
wsmouse0 at ums0 mux 0
uhub3 at uhub1 port 1 configuration 1 interface 0 "Cypress Semiconductor USB2 Hub" rev 2.00/90.15 addr 2
vscsi0 at root
scsibus4 at vscsi0: 256 targets
softraid0 at root
scsibus5 at softraid0: 256 targets
root on sd0a (8662c5862198d925.a) swap on sd0b dump on sd0b
amdgpu0: POLARIS12 8 CU rev 0x00
amdgpu0: 1024x768, 32bpp
wsdisplay0 at amdgpu0 mux 1
wskbd0: connecting to wsdisplay0
wsdisplay0: screen 0-5 added (std, vt100 emulation)
simplefb0 at mainbus0: 1024x768, 32bpp
wsdisplay1 at simplefb0 mux 1
wsdisplay1: screen 0-5 added (std, vt100 emulation)