Index | Thread | Search

From:
Stefan Fritsch <sf@openbsd.org>
Subject:
Re: virtio_pci change for SEV
To:
Mark Kettenis <mark.kettenis@xs4all.nl>
Cc:
tech@openbsd.org
Date:
Mon, 28 Oct 2024 10:34:32 +0100

Download raw body.

Thread
ping?

On Wed, 9 Oct 2024, Stefan Fritsch wrote:

> On Tue, 8 Oct 2024, Mark Kettenis wrote:
> 
> > > Date: Tue, 8 Oct 2024 09:14:29 +0200 (CEST)
> > > From: Stefan Fritsch <sf@openbsd.org>
> > > 
> > > Hi,
> > > 
> > > we always use the standard pci busdma machinery. Therefore we should
> > > negotiate the VIRTIO_F_ACCESS_PLATFORM feature if it is offered.
> > > Strictly speaking we should bypass any existing iommus if the host 
> > > does not offer this feature, but this no regression and we can do that 
> > > when/if we introduce our own busdma tag for virtio.
> > > 
> > > Accepting VIRTIO_F_ACCESS_PLATFORM is required for SEV on KVM/qemu.
> > > 
> > > ok?
> > 
> > Sorry, but your explanation makes no sense.  What does "the standard
> > pci busdma machinery" mean?  I think it means different things on
> > different architectures.
> > 
> > The description of the VIRTIO_F_ACCESS_PLATFORM flag is very vague.  I
> > think the only possible interpretation is that not setting the flag
> > means "we're Linux and you can make assumptions and optimize things".
> > So indeed setting that flag uncodinitionally may make sense.  But we
> > need a better comment.
> 
> Yes, the description is rather vague. I interpret it that having it 
> enabled means we should do like we would do on real hardware WRT address 
> translation / IOMMUs / cache flushes, while with it disabled we should 
> always use physical addresses in the rings, even if there seems to be an 
> iommu. For the IOMMU part, I know that this is how qemu/KVM work in 
> practice.
> 
> What "like on real hardware" means exactly is of course architecture 
> dependent, but so is the meaning of the feature flag.
> 
> There is also VIRTIO_F_ORDER_PLATFORM with an equally vague description. I 
> think we should also set this flag as we don't use any optimized weaker 
> memory barriers. But in practice, this seems to be only relevant for 
> virtio implemented in hardware/FPGAs like on some Mellanox NICs.
> 
> Updated diff below that sets both flags and with more verbose 
> descriptions.
> 
> ok?
> 
> Cheers,
> Stefan
> 
> commit 3a4e0ac06eec036ec17616b88716701f3cf641cc
> Author: Stefan Fritsch <sf@sfritsch.de>
> Date:   Mon Sep 16 16:02:35 2024 +0200
> 
>     virtio_pci: Negotiate ACCESS_PLATTFORM/ORDER_PLATFORM features
>     
>     We always use the standard pci access mechanisms as we would on real
>     hardware. Therefore we should negotiate the VIRTIO_F_ACCESS_PLATFORM
>     feature if it is offered.  Strictly speaking we should bypass any
>     existing iommus if the host does not offer this feature, but this no
>     regression and is left for later.
>     
>     Also, we use the memory ordering / barriers as provided by
>     bus_dmamap_sync for real hardware. Therefore, we should negotiate the
>     VIRTIO_F_ORDER_PLATFORM feature.
>     
>     Accepting VIRTIO_F_ACCESS_PLATFORM is required for SEV on KVM/qemu.
> 
> diff --git a/sys/dev/pci/virtio_pci.c b/sys/dev/pci/virtio_pci.c
> index f9c8801ceb7..0302794f312 100644
> --- a/sys/dev/pci/virtio_pci.c
> +++ b/sys/dev/pci/virtio_pci.c
> @@ -821,6 +821,24 @@ virtio_pci_negotiate_features_10(struct virtio_softc *vsc,
>  	uint64_t host, negotiated;
>  
>  	vsc->sc_driver_features |= VIRTIO_F_VERSION_1;
> +	/*
> +	 * We always use the normal platform specific pci busdma API.
> +	 * Therefore we use the access mechanisms and translations as
> +	 * defined by the platform we are running on (as if it were
> +	 * real hardware).
> +	 *
> +	 * XXX If the hypervisor does not offer VIRTIO_F_ACCESS_PLATFORM
> +	 * XXX we should bypass any iommu that is present.
> +	 */
> +	vsc->sc_driver_features |= VIRTIO_F_ACCESS_PLATFORM;
> +	/*
> +	 * We always use pci specific memory ordering as implemented by
> +	 * the bus_dmamap_sync().
> +	 *
> +	 * XXX If the hypervisor does not offer VIRTIO_F_ORDER_PLATFORM,
> +	 * XXX we could use weaker barriers.
> +	 */
> +	vsc->sc_driver_features |= VIRTIO_F_ORDER_PLATFORM;
>  	/* notify on empty is 0.9 only */
>  	vsc->sc_driver_features &= ~VIRTIO_F_NOTIFY_ON_EMPTY;
>  	CWRITE(sc, device_feature_select, 0);
>