From: Mark Kettenis
Subject: Re: virtio_pci change for SEV
To: Stefan Fritsch
Cc: tech@openbsd.org
Date: Mon, 28 Oct 2024 12:22:13 +0100

> Date: Mon, 28 Oct 2024 10:34:32 +0100 (CET)
> From: Stefan Fritsch
>
> ping?

Sorry, but the comments still make no sense to me. The different
bus_dma(9) implementations do wildly different things on different
architectures. Maybe, instead of making a statement about our
bus_dma(9) implementation here that I'm not sure is true, just write
something like:

/* XXX Without this SEV doesn't work with a KVM/qemu hypervisor on amd64. */

for the VIRTIO_F_ACCESS_PLATFORM bit? (A sketch of such a pared-down
hunk is at the end of this mail.)

And can we defer the VIRTIO_F_ORDER_PLATFORM bit for now? I think that
should be done in a separate diff; mixing the two in a single
discussion isn't helping us understand what's going on.

Cheers,
Mark

> On Wed, 9 Oct 2024, Stefan Fritsch wrote:
>
> > On Tue, 8 Oct 2024, Mark Kettenis wrote:
> >
> > > > Date: Tue, 8 Oct 2024 09:14:29 +0200 (CEST)
> > > > From: Stefan Fritsch
> > > >
> > > > Hi,
> > > >
> > > > we always use the standard pci busdma machinery. Therefore we
> > > > should negotiate the VIRTIO_F_ACCESS_PLATFORM feature if it is
> > > > offered. Strictly speaking we should bypass any existing iommus
> > > > if the host does not offer this feature, but this is no
> > > > regression and we can do that when/if we introduce our own
> > > > busdma tag for virtio.
> > > >
> > > > Accepting VIRTIO_F_ACCESS_PLATFORM is required for SEV on
> > > > KVM/qemu.
> > > >
> > > > ok?
> > >
> > > Sorry, but your explanation makes no sense. What does "the
> > > standard pci busdma machinery" mean? I think it means different
> > > things on different architectures.
> > >
> > > The description of the VIRTIO_F_ACCESS_PLATFORM flag is very
> > > vague. I think the only possible interpretation is that not
> > > setting the flag means "we're Linux and you can make assumptions
> > > and optimize things". So indeed setting that flag unconditionally
> > > may make sense. But we need a better comment.
> >
> > Yes, the description is rather vague. I interpret it as: having it
> > enabled means we should behave as we would on real hardware WRT
> > address translation / IOMMUs / cache flushes, while with it
> > disabled we should always use physical addresses in the rings, even
> > if there seems to be an iommu. For the IOMMU part, I know that this
> > is how qemu/KVM work in practice.
> >
> > What "like on real hardware" means exactly is of course
> > architecture dependent, but so is the meaning of the feature flag.
> >
> > There is also VIRTIO_F_ORDER_PLATFORM with an equally vague
> > description. I think we should also set this flag, as we don't use
> > any optimized weaker memory barriers. But in practice this seems to
> > be relevant only for virtio implemented in hardware/FPGAs, like on
> > some Mellanox NICs.
> >
> > Updated diff below that sets both flags and has more verbose
> > descriptions.
> >
> > ok?
> >
> > Cheers,
> > Stefan
> >
> > commit 3a4e0ac06eec036ec17616b88716701f3cf641cc
> > Author: Stefan Fritsch
> > Date: Mon Sep 16 16:02:35 2024 +0200
> >
> > virtio_pci: Negotiate ACCESS_PLATFORM/ORDER_PLATFORM features
> >
> > We always use the standard pci access mechanisms as we would on real
> > hardware. Therefore we should negotiate the VIRTIO_F_ACCESS_PLATFORM
> > feature if it is offered. Strictly speaking we should bypass any
> > existing iommus if the host does not offer this feature, but this is
> > no regression and is left for later.
> >
> > Also, we use the memory ordering / barriers as provided by
> > bus_dmamap_sync for real hardware. Therefore, we should negotiate
> > the VIRTIO_F_ORDER_PLATFORM feature.
> >
> > Accepting VIRTIO_F_ACCESS_PLATFORM is required for SEV on KVM/qemu.
> >
> > diff --git a/sys/dev/pci/virtio_pci.c b/sys/dev/pci/virtio_pci.c
> > index f9c8801ceb7..0302794f312 100644
> > --- a/sys/dev/pci/virtio_pci.c
> > +++ b/sys/dev/pci/virtio_pci.c
> > @@ -821,6 +821,24 @@ virtio_pci_negotiate_features_10(struct virtio_softc *vsc,
> >  	uint64_t host, negotiated;
> >  
> >  	vsc->sc_driver_features |= VIRTIO_F_VERSION_1;
> > +	/*
> > +	 * We always use the normal platform specific pci busdma API.
> > +	 * Therefore we use the access mechanisms and translations as
> > +	 * defined by the platform we are running on (as if it were
> > +	 * real hardware).
> > +	 *
> > +	 * XXX If the hypervisor does not offer VIRTIO_F_ACCESS_PLATFORM
> > +	 * XXX we should bypass any iommu that is present.
> > +	 */
> > +	vsc->sc_driver_features |= VIRTIO_F_ACCESS_PLATFORM;
> > +	/*
> > +	 * We always use pci specific memory ordering as implemented
> > +	 * by bus_dmamap_sync().
> > +	 *
> > +	 * XXX If the hypervisor does not offer VIRTIO_F_ORDER_PLATFORM,
> > +	 * XXX we could use weaker barriers.
> > +	 */
> > +	vsc->sc_driver_features |= VIRTIO_F_ORDER_PLATFORM;
> >  	/* notify on empty is 0.9 only */
> >  	vsc->sc_driver_features &= ~VIRTIO_F_NOTIFY_ON_EMPTY;
> >  	CWRITE(sc, device_feature_select, 0);
> > 
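
P.S. To make the suggestion concrete: a rough, untested sketch of what
the pared-down change in virtio_pci_negotiate_features_10() could look
like, taking only the ACCESS_PLATFORM bit and deferring ORDER_PLATFORM
to a separate diff. The identifiers and surrounding lines are as in the
diff quoted above; only the comment wording is new:

	vsc->sc_driver_features |= VIRTIO_F_VERSION_1;
	/*
	 * XXX Without this SEV doesn't work with a KVM/qemu hypervisor
	 * XXX on amd64.
	 */
	vsc->sc_driver_features |= VIRTIO_F_ACCESS_PLATFORM;
	/* notify on empty is 0.9 only */
	vsc->sc_driver_features &= ~VIRTIO_F_NOTIFY_ON_EMPTY;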