Index | Thread | Search

From:
Stefan Fritsch <sf@openbsd.org>
Subject:
Use acpipci in VMs (was: PCI BAR mapping in qemu VMs)
To:
tech@openbsd.org
Cc:
Mark Kettenis <mark.kettenis@xs4all.nl>
Date:
Fri, 21 Mar 2025 14:24:08 +0100

Download raw body.

Thread
Hi,

On Sun, 16 Mar 2025, Stefan Fritsch wrote:
> there were some reports that vio on KVM/qemu sometimes panics with
> 
>  vq_size not power of two: 65535
> 
> but I could never reproduce it. bluhm@ now got me a test setup where the 
> bsd kernel is PXE booted on qemu in 440fx mode, and there it is 
> reproducible.
> 
> After some debugging it seems that seabios or ipxe maps the PCI BARs at 
> 0x380000000000-0x380080000000 which is outside the allowed range in 
> pci_init_extents(). On the other hand, in 440fx mode, qemu seems to 
> produce ACPI 1.x tables and there is a check in acpipci_attach() that for 
> ACPI < 5.x, the PCI infos from _CRS are not used. OpenBSD will then 
> disable the BARs and when mapping them again in vio_attach(), it will 
> sometimes choose adresses that do not work, reads return 0xff and writes 
> are ignored. I guess this is becuase the address (in my case 0xbff14000) 
> lies outside the PCI window of the emulated chipset.
> 
> I have put dmesg, acpi tables and other info at 
> https://www.sfritsch.de/~stf/vq-panic/
> 
> Qemu in q35 mode produces ACPI 3.x tables, so it may also be affected.
> 
> There may be three ways to fix this:
> 
> 1) increase the allowed range for pcimem in pci_init_extents(). This is 
> what the diff below does.
> 
> 2) somehow make acpipci_attach() use the ACPI infos on qemu. I have 
> verified that removing the version check fixes the issue. Since removing 
> the version check seems to break many other systems, this would have to be 
> a qemu specific quirk.
> 
> 3) try to make OpenBSD reliably map the BARs somewhere where it works. Is 
> there a way for OpenBSD to get the info where the PCI window is without 
> trusting ACPI?
> 
> I remember at least one report of this issue on i386. Any idea how to fix 
> it there?

Mark Patruck noticed that these issues seem to be caused by some 
relatively recent changes in seabios.

https://mail.coreboot.org/hyperkitty/list/seabios@seabios.org/message/R7FOQMMYWVX577QNIA2AKUAGOZKNJIAP/
https://gitlab.com/qemu-project/seabios/-/commit/df9dd418b3b0e586cb208125094620fc7f90f23d

A workaround seems to be to configure the VM with <= 3GB memory.

The problem may become more wide-spread with 7.7, since we now default to 
virtio 1.x, which uses MMIO on qemu, compared to virtio 0.9 which uses PIO 
BARs. Therefore it would be nice to get a fix in before the release, if it 
is not too late already.

The diff below uses acpipci / _CRS also with old ACPI versions if running 
on a hypervisor. I think the chance that it will break unrelated systems 
is low. It does not change behavior on vmd, where no acpi attaches at all.

ok?


diff --git a/sys/arch/amd64/pci/acpipci.c b/sys/arch/amd64/pci/acpipci.c
index 51cd1360383..2e3236772bb 100644
--- a/sys/arch/amd64/pci/acpipci.c
+++ b/sys/arch/amd64/pci/acpipci.c
@@ -152,7 +152,7 @@ acpipci_attach(struct device *parent, struct device *self, void *aux)
 
 	aml_parse_resource(&res, acpipci_parse_resources, sc);
 
-	if (sc->sc_acpi->sc_major < 5) {
+	if (sc->sc_acpi->sc_major < 5 && (cpu_ecxfeature & CPUIDECX_HV) == 0) {
 		extent_destroy(sc->sc_ioex);
 		extent_destroy(sc->sc_memex);