From: Stefan Fritsch <sf@openbsd.org>
Subject: PCI BAR mapping in qemu VMs
To: tech@openbsd.org
Cc: Mark Kettenis <mark.kettenis@xs4all.nl>
Date: Sun, 16 Mar 2025 10:08:46 +0100

Hi,

there were some reports that vio on KVM/qemu sometimes panics with

 vq_size not power of two: 65535

but I could never reproduce it. bluhm@ now got me a test setup where the 
bsd kernel is PXE booted on qemu in 440fx mode, and there it is 
reproducible.

After some debugging it seems that seabios or ipxe maps the PCI BARs at
0x380000000000-0x380080000000, which is outside the allowed range in
pci_init_extents(). At the same time, in 440fx mode, qemu seems to
produce ACPI 1.x tables, and there is a check in acpipci_attach() so that
for ACPI < 5.x the PCI information from _CRS is not used. OpenBSD will
then disable the BARs, and when mapping them again in vio_attach(), it
will sometimes choose addresses that do not work: reads return 0xff and
writes are ignored. I guess this is because the address (in my case
0xbff14000) lies outside the PCI window of the emulated chipset.

I have put dmesg, acpi tables and other info at 
https://www.sfritsch.de/~stf/vq-panic/

Qemu in q35 mode produces ACPI 3.x tables, so it may also be affected.

There may be three ways to fix this:

1) increase the allowed range for pcimem in pci_init_extents(). This is 
what the diff below does.

2) somehow make acpipci_attach() use the ACPI information on qemu. I have
verified that removing the version check fixes the issue. Since removing
the version check seems to break many other systems, this would have to be
a qemu-specific quirk.

3) try to make OpenBSD reliably map the BARs somewhere where it works. Is 
there a way for OpenBSD to get the info where the PCI window is without 
trusting ACPI?

I remember at least one report of this issue on i386. Any idea how to fix 
it there?

Cheers,
Stefan


diff --git a/sys/arch/amd64/pci/pci_machdep.c b/sys/arch/amd64/pci/pci_machdep.c
index 78ca6f688b2..ccd6779bb2f 100644
--- a/sys/arch/amd64/pci/pci_machdep.c
+++ b/sys/arch/amd64/pci/pci_machdep.c
@@ -954,13 +954,18 @@ pci_init_extents(void)
 		 * Dell 13G servers have important devices outside the
 		 * 36-bit address space.  Until we can extract the address
 		 * ranges from ACPI, expand the allowed range to suit.
+		 *
+		 * Seabios on qemu in pc-i440fx mode may map BARs at
+		 * 0x380000000000. But it claims ACPI version 1, so acpipci
+		 * will not be used.
+		 * Expand the range to match.
 		 */
 		pcimem_ex = extent_create("pcimem", 0, 0xffffffffffffffffUL,
 		    M_DEVBUF, NULL, 0, EX_NOWAIT);
 		if (pcimem_ex == NULL)
 			return;
-		extent_alloc_region(pcimem_ex, 0x40000000000UL,
-		    0xfffffc0000000000UL, EX_NOWAIT);
+		extent_alloc_region(pcimem_ex, 0x400000000000UL,
+		    0xffffc00000000000UL, EX_NOWAIT);
 
 		for (bmp = bios_memmap; bmp->type != BIOS_MAP_END; bmp++) {
 			/*