From: Mike Larkin <mlarkin@nested.page>
Subject: Re: SEV-ES multiprocessor support / handle WBINDV
To: Stefan Fritsch <sf@sfritsch.de>
Cc: tech@openbsd.org
Date: Tue, 25 Nov 2025 14:33:45 -0800

On Tue, Nov 25, 2025 at 10:02:27PM +0100, Stefan Fritsch wrote:
> On Tue, 25 Nov 2025, Mike Larkin wrote:
>
> > On Tue, Nov 25, 2025 at 08:02:28PM +0100, Stefan Fritsch wrote:
> > > Hi Mike,
> > >
> > > On Tue, 25 Nov 2025, Mike Larkin wrote:
> > >
> > > > On Tue, Nov 25, 2025 at 10:41:45AM +0100, Stefan Fritsch wrote:
> > > > > Hi,
> > > > >
> > > > > On Thu, 18 Sep 2025, Stefan Fritsch wrote:
> > > > > > one remaining problem with SEV-ES is that we don't support multiprocessor
> > > > >
> > > > > for multiprocessor support with SEV-ES, we need to handle WBINDV in the
> > > > > VC trap handler.
> > > > >
> > > > > This is the the first part of the larger diff that I have sent in the
> > > > > mail quoted above.
> > > > >
> > > > > ok?
> > > > >
> > > >
> > > > I don't understand this. We're saying if the guest does a WBINVD, we inject that
> > > > back into the guest via a #VC, but then in the trap handler we ... do nothing?
> > > > We just advance %rip?
> > > >
> > > > What happened to the guest's desired wbinvd/wbnoinvd?
> > >
> > > Sorry, I should have sent a diff with longer context. After the switch
> > > statement we copy the exit code to the GHCB and then exit to the
> > > hypervisor. As WBINVD does not take any arguments, no more information or
> > > handling is neccessary. Diff with more context below.
> > >
> > > Cheers,
> > > Stefan
> > >
> >
> > I'm still not seeing it. Sorry if I'm being dense here.
>
> These are good questions. I think we are looking at this from different
> angles, me from OpenBSD on linux/KVM, you from OpenBSD on vmm, therefore
> the misunderstanding. Sorry about that.
>
> > The context below helps. So we take a #VC trap for our (guest's) own
> > WBINVD/WBNOINVD, then call vmgexit and end up in svm_handle_vmgexit in
> > vmm_machdep.c . It's unclear from this diff whether or not we are using the
> > GHCB MSR protocol, but regardless if we are or aren't, there is no provision
> > for handling these instructions in vmm in either case. So we take a needless
> > exit and then return back to the guest after making a round-trip journey to
> > vmm to ... do nothing?
>
> vmm does not set SVM_INTERCEPT_WBINVD and we don't hit this code path. But
> KVM does set the intercept. However, OpenBSD running on a single CPU does
> not seem to call WBINVD, and we don't hit it on KVM, yet. But we will when
> we enable MP on SEV-ES.
>
> KVM also does something with the exit. If I understand the code correctly,
> if certain PCI devices are passed through to the guest and if the VCPU has
> been migrated to a different CPU recently, KVM needs to do the WBINVD both
> on the previous and the current CPU.
>
> > Since this is just a wbinvd, why don't we just handle it in the #VC trap
> > handler? There should not be any side effects of handling it ourselves here.
> >
> > Eg, in the case SVM_VMEXIT_WBINVD below, just call wbinvd, advance
> > tf_rip, and then iret? (maybe special case needed for wbnoinvd?)
> >
> > What am I missing?
>
> If we called wbinvd in the vc trap handler directly, we would get a double
> fault. I don't know why AMD designed this as non-automatic exit and why
> the guest's vc trap handler may want to do something different than just
> pass the exit on to the hypervisor. But we need to deal with it.
>

of course. that makes sense. the diff makes sense now.

after you commit, I'm going to add some commentary in this area around what
vmm(4) is doing here so that in 6 months when I look at this again I'm not
confused :)

ok mlarkin on the diff as-is, if you're still waiting on oks.

-ml

> Cheers,
> Stefan
>
> >
> > -ml
> >
> > > --- a/sys/arch/amd64/amd64/trap.c
> > > +++ b/sys/arch/amd64/amd64/trap.c
> > > @@ -428,40 +428,50 @@ vctrap(struct trapframe *frame, int user, int *sig, int *code)
> > > ...
> > > +	case SVM_VMEXIT_WBINVD:
> > > +		/*
> > > +		 * There is no special GHCB request for WBNOINVD.
> > > +		 * Signal WBINVD to emulate WBNOINVD.
> > > +		 */
> > > +		if (*rip == 0xf3)
> > > +			frame->tf_rip += 3;
> > > +		else
> > > +			frame->tf_rip += 2;
> > > +		break;
> > >  	default:
> > >  		panic("invalid exit code 0x%llx", ghcb_regs.exitcode);
> > >  	}
> > >
> > >  	/* Always required */
> > >  	ghcb_sync_val(GHCB_SW_EXITCODE, GHCB_SZ64, &syncout);
> > >  	ghcb_sync_val(GHCB_SW_EXITINFO1, GHCB_SZ64, &syncout);
> > >  	ghcb_sync_val(GHCB_SW_EXITINFO2, GHCB_SZ64, &syncout);
> > >
> > >  	/* Sync out to GHCB */
> > >  	ghcb = (struct ghcb_sa *)ghcb_vaddr;
> > >  	ghcb_sync_out(frame, &ghcb_regs, ghcb, &syncout);
> > >
> > >  	wrmsr(MSR_SEV_GHCB, ghcb_paddr);
> > >
> > >  	/* Call hypervisor. */
> > >  	vmgexit();
> > >
> > >  	/* Verify response */
> > >  	if (ghcb_verify_bm(ghcb->valid_bitmap, syncin.valid_bitmap)) {
> > >
> > >
> >