From: Stefan Fritsch
Subject: Re: virtio_pci change for SEV
To: Mark Kettenis
Cc: tech@openbsd.org
Date: Wed, 9 Oct 2024 11:25:45 +0200

On Tue, 8 Oct 2024, Mark Kettenis wrote:
> > Date: Tue, 8 Oct 2024 09:14:29 +0200 (CEST)
> > From: Stefan Fritsch
> >
> > Hi,
> >
> > we always use the standard pci busdma machinery. Therefore we should
> > negotiate the VIRTIO_F_ACCESS_PLATFORM feature if it is offered.
> > Strictly speaking we should bypass any existing iommus if the host
> > does not offer this feature, but this is no regression and we can do
> > that when/if we introduce our own busdma tag for virtio.
> >
> > Accepting VIRTIO_F_ACCESS_PLATFORM is required for SEV on KVM/qemu.
> >
> > ok?
>
> Sorry, but your explanation makes no sense. What does "the standard
> pci busdma machinery" mean? I think it means different things on
> different architectures.
>
> The description of the VIRTIO_F_ACCESS_PLATFORM flag is very vague. I
> think the only possible interpretation is that not setting the flag
> means "we're Linux and you can make assumptions and optimize things".
> So indeed setting that flag unconditionally may make sense. But we
> need a better comment.

Yes, the description is rather vague. I interpret it like this: with
the feature enabled we should behave as we would on real hardware WRT
address translation / IOMMUs / cache flushes, while with it disabled we
should always put physical addresses into the rings, even if there
seems to be an iommu. For the IOMMU part, I know that this is how
qemu/KVM behave in practice. What "like on real hardware" means exactly
is of course architecture dependent, but so is the meaning of the
feature flag.

There is also VIRTIO_F_ORDER_PLATFORM, with an equally vague
description. I think we should set this flag as well, since we don't
use any optimized weaker memory barriers. In practice, though, this
seems to be relevant only for virtio implemented in hardware/FPGAs,
like on some Mellanox NICs.

Updated diff below that sets both flags and has more verbose comments.

ok?

Cheers,
Stefan


commit 3a4e0ac06eec036ec17616b88716701f3cf641cc
Author: Stefan Fritsch
Date:   Mon Sep 16 16:02:35 2024 +0200

    virtio_pci: Negotiate ACCESS_PLATFORM/ORDER_PLATFORM features

    We always use the standard pci access mechanisms as we would on real
    hardware. Therefore we should negotiate the VIRTIO_F_ACCESS_PLATFORM
    feature if it is offered. Strictly speaking we should bypass any
    existing iommus if the host does not offer this feature, but this is
    no regression and is left for later.

    Also, we use the memory ordering / barriers as provided by
    bus_dmamap_sync() for real hardware. Therefore, we should negotiate
    the VIRTIO_F_ORDER_PLATFORM feature.

    Accepting VIRTIO_F_ACCESS_PLATFORM is required for SEV on KVM/qemu.

diff --git a/sys/dev/pci/virtio_pci.c b/sys/dev/pci/virtio_pci.c
index f9c8801ceb7..0302794f312 100644
--- a/sys/dev/pci/virtio_pci.c
+++ b/sys/dev/pci/virtio_pci.c
@@ -821,6 +821,24 @@ virtio_pci_negotiate_features_10(struct virtio_softc *vsc,
 	uint64_t host, negotiated;

 	vsc->sc_driver_features |= VIRTIO_F_VERSION_1;
+	/*
+	 * We always use the normal platform specific pci busdma API.
+	 * Therefore we use the access mechanisms and translations as
+	 * defined by the platform we are running on (as if it were
+	 * real hardware).
+	 *
+	 * XXX If the hypervisor does not offer VIRTIO_F_ACCESS_PLATFORM,
+	 * XXX we should bypass any iommu that is present.
+	 */
+	vsc->sc_driver_features |= VIRTIO_F_ACCESS_PLATFORM;
+	/*
+	 * We always use pci specific memory ordering as implemented by
+	 * bus_dmamap_sync().
+	 *
+	 * XXX If the hypervisor does not offer VIRTIO_F_ORDER_PLATFORM,
+	 * XXX we could use weaker barriers.
+	 */
+	vsc->sc_driver_features |= VIRTIO_F_ORDER_PLATFORM;
 	/* notify on empty is 0.9 only */
 	vsc->sc_driver_features &= ~VIRTIO_F_NOTIFY_ON_EMPTY;
 	CWRITE(sc, device_feature_select, 0);
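
P.S. Not part of the diff: if a child driver wants to check the outcome
of the negotiation, a minimal sketch (assuming the virtio_has_feature()
helper from virtiovar.h and an attached struct virtio_softc *vsc) would
look like:

	/*
	 * Sketch only: test whether VIRTIO_F_ACCESS_PLATFORM was both
	 * offered by the host and accepted during negotiation.
	 */
	if (virtio_has_feature(vsc, VIRTIO_F_ACCESS_PLATFORM)) {
		/*
		 * Addresses in the rings go through the platform's
		 * translation (iommu, SEV, ...), as on real hardware.
		 */
	}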
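
And to illustrate the ORDER_PLATFORM point: it amounts to ordering ring
updates through bus_dmamap_sync(9) rather than through hand-rolled
weaker barriers. A rough sketch with hypothetical tag/map/offset
variables (dmat, ring_map, desc_off, ...), not the actual virtio(4)
ring code:

	/* make the filled descriptors visible to the device ... */
	bus_dmamap_sync(dmat, ring_map, desc_off, desc_len,
	    BUS_DMASYNC_PREWRITE);
	avail->idx = new_idx;	/* ... then publish the avail index */
	/* ... and flush the index update before notifying the device */
	bus_dmamap_sync(dmat, ring_map, avail_off, avail_len,
	    BUS_DMASYNC_PREWRITE);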