From: Mark Kettenis
Subject: Re: virtio_pci change for SEV
To: Stefan Fritsch
Cc: tech@openbsd.org
Date: Mon, 28 Oct 2024 12:22:13 +0100

> Date: Mon, 28 Oct 2024 10:34:32 +0100 (CET)
> From: Stefan Fritsch
>
> ping?

Sorry, but the comments still make no sense to me. The different
bus_dma(9) implementations do wildly different things on different
architectures. Maybe, instead of making a statement about our
bus_dma(9) implementation here that I'm not sure is true, just write
something like:

/* XXX Without this SEV doesn't work with a KVM/qemu hypervisor on amd64. */

for the VIRTIO_F_ACCESS_PLATFORM bit? (A sketch of such a pared-down
hunk is at the end of this mail.)

And can we defer the VIRTIO_F_ORDER_PLATFORM bit for now? I think that
should be done in a separate diff; mixing the two in a single
discussion isn't helping us understand what's going on.

Cheers,
Mark

> On Wed, 9 Oct 2024, Stefan Fritsch wrote:
>
> > On Tue, 8 Oct 2024, Mark Kettenis wrote:
> >
> > > > Date: Tue, 8 Oct 2024 09:14:29 +0200 (CEST)
> > > > From: Stefan Fritsch
> > > >
> > > > Hi,
> > > >
> > > > we always use the standard pci busdma machinery. Therefore we
> > > > should negotiate the VIRTIO_F_ACCESS_PLATFORM feature if it is
> > > > offered. Strictly speaking we should bypass any existing iommus
> > > > if the host does not offer this feature, but this is no
> > > > regression and we can do that when/if we introduce our own
> > > > busdma tag for virtio.
> > > >
> > > > Accepting VIRTIO_F_ACCESS_PLATFORM is required for SEV on
> > > > KVM/qemu.
> > > >
> > > > ok?
> > >
> > > Sorry, but your explanation makes no sense. What does "the
> > > standard pci busdma machinery" mean? I think it means different
> > > things on different architectures.
> > >
> > > The description of the VIRTIO_F_ACCESS_PLATFORM flag is very
> > > vague. I think the only possible interpretation is that not
> > > setting the flag means "we're Linux and you can make assumptions
> > > and optimize things". So indeed setting that flag unconditionally
> > > may make sense. But we need a better comment.
> >
> > Yes, the description is rather vague. I interpret it as: having it
> > enabled means we should behave as we would on real hardware WRT
> > address translation / IOMMUs / cache flushes, while with it
> > disabled we should always use physical addresses in the rings, even
> > if there seems to be an iommu. For the IOMMU part, I know that this
> > is how qemu/KVM work in practice.
> >
> > What "like on real hardware" means exactly is of course
> > architecture dependent, but so is the meaning of the feature flag.
> >
> > There is also VIRTIO_F_ORDER_PLATFORM with an equally vague
> > description. I think we should also set this flag, as we don't use
> > any optimized weaker memory barriers. But in practice this seems to
> > be relevant only for virtio implemented in hardware/FPGAs, like on
> > some Mellanox NICs.
> >
> > Updated diff below that sets both flags and has more verbose
> > descriptions.
> >
> > ok?
> >
> > Cheers,
> > Stefan
> >
> > commit 3a4e0ac06eec036ec17616b88716701f3cf641cc
> > Author: Stefan Fritsch
> > Date: Mon Sep 16 16:02:35 2024 +0200
> >
> > virtio_pci: Negotiate ACCESS_PLATFORM/ORDER_PLATFORM features
> >
> > We always use the standard pci access mechanisms as we would on real
> > hardware. Therefore we should negotiate the VIRTIO_F_ACCESS_PLATFORM
> > feature if it is offered. Strictly speaking we should bypass any
> > existing iommus if the host does not offer this feature, but this is
> > no regression and is left for later.
> >
> > Also, we use the memory ordering / barriers as provided by
> > bus_dmamap_sync for real hardware. Therefore, we should negotiate
> > the VIRTIO_F_ORDER_PLATFORM feature.
> >
> > Accepting VIRTIO_F_ACCESS_PLATFORM is required for SEV on KVM/qemu.
> >
> > diff --git a/sys/dev/pci/virtio_pci.c b/sys/dev/pci/virtio_pci.c
> > index f9c8801ceb7..0302794f312 100644
> > --- a/sys/dev/pci/virtio_pci.c
> > +++ b/sys/dev/pci/virtio_pci.c
> > @@ -821,6 +821,24 @@ virtio_pci_negotiate_features_10(struct virtio_softc *vsc,
> >  	uint64_t host, negotiated;
> >  
> >  	vsc->sc_driver_features |= VIRTIO_F_VERSION_1;
> > +	/*
> > +	 * We always use the normal platform specific pci busdma API.
> > +	 * Therefore we use the access mechanisms and translations as
> > +	 * defined by the platform we are running on (as if it were
> > +	 * real hardware).
> > +	 *
> > +	 * XXX If the hypervisor does not offer VIRTIO_F_ACCESS_PLATFORM
> > +	 * XXX we should bypass any iommu that is present.
> > +	 */
> > +	vsc->sc_driver_features |= VIRTIO_F_ACCESS_PLATFORM;
> > +	/*
> > +	 * We always use pci specific memory ordering as implemented
> > +	 * by bus_dmamap_sync().
> > +	 *
> > +	 * XXX If the hypervisor does not offer VIRTIO_F_ORDER_PLATFORM,
> > +	 * XXX we could use weaker barriers.
> > +	 */
> > +	vsc->sc_driver_features |= VIRTIO_F_ORDER_PLATFORM;
> >  	/* notify on empty is 0.9 only */
> >  	vsc->sc_driver_features &= ~VIRTIO_F_NOTIFY_ON_EMPTY;
> >  	CWRITE(sc, device_feature_select, 0);
> > 
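
P.S. To make the suggestion concrete: a rough, untested sketch of what
the pared-down change in virtio_pci_negotiate_features_10() could look
like, taking only the ACCESS_PLATFORM bit and deferring ORDER_PLATFORM
to a separate diff. The identifiers and surrounding lines are as in the
diff quoted above; only the comment wording is new:

	vsc->sc_driver_features |= VIRTIO_F_VERSION_1;
	/*
	 * XXX Without this SEV doesn't work with a KVM/qemu hypervisor
	 * XXX on amd64.
	 */
	vsc->sc_driver_features |= VIRTIO_F_ACCESS_PLATFORM;
	/* notify on empty is 0.9 only */
	vsc->sc_driver_features &= ~VIRTIO_F_NOTIFY_ON_EMPTY;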