From: Alexander Bluhm Subject: Re: Use acpipci in VMs To: Dave Voutila Cc: Mark Kettenis , Stefan Fritsch , tech@openbsd.org Date: Mon, 31 Mar 2025 17:59:34 +0200 On Mon, Mar 31, 2025 at 11:35:16AM -0400, Dave Voutila wrote: > Mark Kettenis writes: > > >> Date: Fri, 21 Mar 2025 14:24:08 +0100 (CET) > >> From: Stefan Fritsch > > > > Hi Stefan, > > > >> Hi, > >> > >> On Sun, 16 Mar 2025, Stefan Fritsch wrote: > >> > there were some reports that vio on KVM/qemu sometimes panics with > >> > > >> > vq_size not power of two: 65535 > >> > > >> > but I could never reproduce it. bluhm@ now got me a test setup where the > >> > bsd kernel is PXE booted on qemu in 440fx mode, and there it is > >> > reproducible. > >> > > >> > After some debugging it seems that seabios or ipxe maps the PCI BARs at > >> > 0x380000000000-0x380080000000 which is outside the allowed range in > >> > pci_init_extents(). On the other hand, in 440fx mode, qemu seems to > >> > produce ACPI 1.x tables and there is a check in acpipci_attach() that for > >> > ACPI < 5.x, the PCI infos from _CRS are not used. OpenBSD will then > >> > disable the BARs and when mapping them again in vio_attach(), it will > >> > sometimes choose adresses that do not work, reads return 0xff and writes > >> > are ignored. I guess this is becuase the address (in my case 0xbff14000) > >> > lies outside the PCI window of the emulated chipset. > >> > > >> > I have put dmesg, acpi tables and other info at > >> > https://www.sfritsch.de/~stf/vq-panic/ > >> > > >> > Qemu in q35 mode produces ACPI 3.x tables, so it may also be affected. > >> > > >> > There may be three ways to fix this: > >> > > >> > 1) increase the allowed range for pcimem in pci_init_extents(). This is > >> > what the diff below does. > >> > > >> > 2) somehow make acpipci_attach() use the ACPI infos on qemu. I have > >> > verified that removing the version check fixes the issue. Since removing > >> > the version check seems to break many other systems, this would have to be > >> > a qemu specific quirk. > >> > > >> > 3) try to make OpenBSD reliably map the BARs somewhere where it works. Is > >> > there a way for OpenBSD to get the info where the PCI window is without > >> > trusting ACPI? > >> > > >> > I remember at least one report of this issue on i386. Any idea how to fix > >> > it there? > >> > >> Mark Patruck noticed that these issues seem to be caused by some > >> relatively recent changes in seabios. > >> > >> https://mail.coreboot.org/hyperkitty/list/seabios@seabios.org/message/R7FOQMMYWVX577QNIA2AKUAGOZKNJIAP/ > >> https://gitlab.com/qemu-project/seabios/-/commit/df9dd418b3b0e586cb208125094620fc7f90f23d > >> > >> A workaround seems to be to configure the VM with <= 3GB memory. > >> > >> The problem may become more wide-spread with 7.7, since we now default to > >> virtio 1.x, which uses MMIO on qemu, compared to virtio 0.9 which uses PIO > >> BARs. Therefore it would be nice to get a fix in before the release, if it > >> is not too late already. > >> > >> The diff below uses acpipci / _CRS also with old ACPI versions if running > >> on a hypervisor. I think the chance that it will break unrelated systems > >> is low. It does not change behavior on vmd, where no acpi attaches at all. > >> > >> ok? > > > > I hate these VM quirks. Why are folks still emulating hardware from > > the 1990's when running a modern OS? > > ...like vmd(8). /gasp /vomit /shudder > > At least our ACPI isn't broken (because it doesn't exist). I test various VM configurations on purpose. Yes, with other qemu settings this problem would vanish. But it is very hard for the user to correlate a panic in driver attachment with a specific setting in virt-manager. I had to debug this together with sf@. I prefer working OpenBSD guests in a range of VM environments over specific VM configuration. Of course than means more quirks in the guest. > > Anyway, not much we can do about that I guess. But maybe we can have > > a bit more consistency? We already have a check to enable MSI for > > QEMU. And since this really is a QEMU issue, maybe it would be better > > to use a PCI_SUBSYS_ID_REG check here? > > > >> diff --git a/sys/arch/amd64/pci/acpipci.c b/sys/arch/amd64/pci/acpipci.c > >> index 51cd1360383..2e3236772bb 100644 > >> --- a/sys/arch/amd64/pci/acpipci.c > >> +++ b/sys/arch/amd64/pci/acpipci.c > >> @@ -152,7 +152,7 @@ acpipci_attach(struct device *parent, struct device *self, void *aux) > >> > >> aml_parse_resource(&res, acpipci_parse_resources, sc); > >> > >> - if (sc->sc_acpi->sc_major < 5) { > >> + if (sc->sc_acpi->sc_major < 5 && (cpu_ecxfeature & CPUIDECX_HV) == 0) { > >> extent_destroy(sc->sc_ioex); > >> extent_destroy(sc->sc_memex); > >> > >> > >>