Index | Thread | Search

From:
Yuichiro NAITO <naito.yuichiro@gmail.com>
Subject:
Re: PCI amd64: Enables MSI/MSI-X interrupts on QEMU i440fx chipset emulation
To:
sf@openbsd.org
Cc:
jsg@jsg.id.au, yasuoka@openbsd.org, mark.kettenis@xs4all.nl, kettenis@openbsd.org, tech@openbsd.org
Date:
Wed, 09 Oct 2024 10:53:52 +0900

Download raw body.

Thread
From: Stefan Fritsch <sf@openbsd.org>
Subject: Re: PCI amd64: Enables MSI/MSI-X interrupts on QEMU i440fx chipset emulation
Date: Mon, 7 Oct 2024 22:58:35 +0200 (CEST)

> On Mon, 7 Oct 2024, Jonathan Gray wrote:
> 
>> On Mon, Oct 07, 2024 at 05:04:26PM +0900, YASUOKA Masahiko wrote:
>> > On Mon, 07 Oct 2024 09:06:37 +0200
>> > Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
>> > >> Date: Mon, 07 Oct 2024 10:54:14 +0900 (JST)
>> > >> From: Yuichiro NAITO <naito.yuichiro@gmail.com>
>> > >> 
>> > >> Hi, this patch sets 'PCI_FLAGS_MSI_ENABLED' if the kernel runs on a QEMU
>> > >> virtual machine with i440fx chipset emulation. The i440fx is the default
>> > >> chipset on the QEMU so users will use it first.
>> > >> And then ixv(4) and iavf(4) will fail because MSI-X interrupts are not
>> > >> supported by the kernel.
>> > >> 
>> > >> The i440fx is quite old and the ACPI table is revision 1. We cannot know
>> > >> if it has MSI capability from the ACPI table. So I add the flag as a quirk.
>> > >> 
>> > >> The same quirk is implemented in NetBSD.
>> > >> 
>> > >> https://github.com/NetBSD/src/blob/affb7619d18b17a184eede63a4823f629a2893fc/sys/arch/x86/pci/pci_machdep.c#L273
>> > >> 
>> > >> FreeBSD seems to have the quirk but is used as the exception for blacklist
>> > >> devices. Probably they see the PCI device capability header of MSI/MSI-X.
>> > >> Until we see the same capability, my patch will be useful for the VF driver
>> > >> users.
>> > >> 
>> > >> OK?
>> > > 
>> > > I think this should be done using the subsystem vendir/id stuff like
>> > > was suggested by someone else in the thread.
>> > 
> 
> 
>> sf proposed checking the subsystem vendor id
>> https://marc.info/?l=openbsd-tech&m=172812592731721&w=2
>> your diff checks the wrong register, should be PCI_SUBSYS_ID_REG
> 
> I am ok with your diff. But freebsd and netbsd have two intel device ids 
> with that quirk. The other is for vmware, which likely does not have the 
> qumranet subsystem vendor id. I would also be ok with a diff that does it 
> in the same way as the other bsds.
> 
> I have checked that with your diff, I could remove the quirk in virtio_pci 
> on qemu and msix still works. However, we disable MSI if the host bridge 
> is not Intel/AMD/Nvidia. There are many other hypervisors than qemu and I 
> think this could be a problem if we removed the virtio_pci quirk. Maybe we 
> should simply trust the acpi tables on hypervisors, see diff below (in 
> addition to your diff)? Or trust them on all acpi >= 2 systems?

I'm afraid that the vmd doesn't offer an ACPI table to a guest OS so that
a guest OpenBSD kernel doesn't detect msix capability if we remove
the quirk in the virtio_pci. I think the quirk should be kept until
the vmd supports ACPI.

> Cheers,
> Stefan
> 
> diff --git a/sys/arch/amd64/pci/acpipci.c b/sys/arch/amd64/pci/acpipci.c
> index ecfb0921d46..d845b0e15bc 100644
> --- a/sys/arch/amd64/pci/acpipci.c
> +++ b/sys/arch/amd64/pci/acpipci.c
> @@ -202,16 +202,18 @@ acpipci_attach_bus(struct device *parent, struct acpipci_softc *sc)
>  		pba.pba_flags |= PCI_FLAGS_MSI_ENABLED;
>  
>  	/*
>  	 * Don't enable MSI on chipsets from low-end manufacturers
>  	 * like VIA and SiS.  We do this by looking at the host
>  	 * bridge, which should be device 0 function 0.
> +	 * Except on hypervisors, there we trust the acpi info.
>  	 */
>  	id = pci_conf_read(pba.pba_pc, tag, PCI_ID_REG);
>  	class = pci_conf_read(pba.pba_pc, tag, PCI_CLASS_REG);
> -	if (PCI_CLASS(class) == PCI_CLASS_BRIDGE &&
> +	if ((cpu_ecxfeature & CPUIDECX_HV) == 0 &&
> +	    PCI_CLASS(class) == PCI_CLASS_BRIDGE &&
>  	    PCI_SUBCLASS(class) != PCI_SUBCLASS_BRIDGE_HOST &&
>  	    PCI_VENDOR(id) != PCI_VENDOR_AMD &&
>  	    PCI_VENDOR(id) != PCI_VENDOR_NVIDIA &&
>  	    PCI_VENDOR(id) != PCI_VENDOR_INTEL)
>  		pba.pba_flags &= ~PCI_FLAGS_MSI_ENABLED;
>  
> 
> 
>> 
>> Index: sys/arch/amd64/pci/acpipci.c
>> ===================================================================
>> RCS file: /cvs/src/sys/arch/amd64/pci/acpipci.c,v
>> diff -u -p -r1.8 acpipci.c
>> --- sys/arch/amd64/pci/acpipci.c	13 May 2024 01:15:50 -0000	1.8
>> +++ sys/arch/amd64/pci/acpipci.c	7 Oct 2024 09:27:42 -0000
>> @@ -194,12 +194,18 @@ acpipci_attach_bus(struct device *parent
>>  	    (sc->sc_acpi->sc_fadt->iapc_boot_arch & FADT_NO_MSI) == 0)
>>  		pba.pba_flags |= PCI_FLAGS_MSI_ENABLED;
>>  
>> +	/* Enable MSI for QEMU claiming ACPI 1.0 */
>> +	tag = pci_make_tag(pba.pba_pc, sc->sc_bus, 0, 0);
>> +	id = pci_conf_read(pba.pba_pc, tag, PCI_SUBSYS_ID_REG);
>> +	if (sc->sc_acpi->sc_fadt->hdr.revision == 1 &&
>> +	    PCI_VENDOR(id) == PCI_VENDOR_QUMRANET)
>> +		pba.pba_flags |= PCI_FLAGS_MSI_ENABLED;
>> +
>>  	/*
>>  	 * Don't enable MSI on chipsets from low-end manufacturers
>>  	 * like VIA and SiS.  We do this by looking at the host
>>  	 * bridge, which should be device 0 function 0.
>>  	 */
>> -	tag = pci_make_tag(pba.pba_pc, sc->sc_bus, 0, 0);
>>  	id = pci_conf_read(pba.pba_pc, tag, PCI_ID_REG);
>>  	class = pci_conf_read(pba.pba_pc, tag, PCI_CLASS_REG);
>>  	if (PCI_CLASS(class) == PCI_CLASS_BRIDGE &&
>> Index: sys/arch/i386/pci/pci_machdep.c
>> ===================================================================
>> RCS file: /cvs/src/sys/arch/i386/pci/pci_machdep.c,v
>> diff -u -p -r1.87 pci_machdep.c
>> --- sys/arch/i386/pci/pci_machdep.c	11 Mar 2021 11:16:57 -0000	1.87
>> +++ sys/arch/i386/pci/pci_machdep.c	7 Oct 2024 09:26:40 -0000
>> @@ -324,6 +324,11 @@ pci_attach_hook(struct device *parent, s
>>  		break;
>>  	}
>>  
>> +	/* Enable MSI for QEMU */
>> +	id = pci_conf_read(pc, tag, PCI_SUBSYS_ID_REG);
>> +	if (PCI_VENDOR(id) == PCI_VENDOR_QUMRANET)
>> +		pba->pba_flags |= PCI_FLAGS_MSI_ENABLED;
>> +
>>  	/*
>>  	 * Don't enable MSI on a HyperTransport bus.  In order to
>>  	 * determine that bus 0 is a HyperTransport bus, we look at
>> 
> 

-- 
Yuichiro NAITO (naito.yuichiro@gmail.com)