
From: Hans-Jörg Höxer <hshoexer@genua.de>
Subject: Re: [EXT] Re: SEV-ES guest: locore #VC trap handling
To: <tech@openbsd.org>
Date: Tue, 24 Jun 2025 14:12:52 +0200

    Hi,
    
    see below for some more answers.
    
    On Fri, Jun 20, 2025 at 09:06:37PM +0200, Alexander Bluhm wrote:
    > ..
    > > > @@ -193,6 +194,58 @@ bi_size_ok:
    > > >  	pushl	$PSL_MBO
    > > >  	popfl
    > > >
    > > > +	/*
    > > > +	 * Set up a temporary #VC trap handler, in case we are running
    > > > +	 * on an AMD CPU in SEV-ES guest mode.  Will be reset by
    > > > +	 * init_x86_64().
    > > > +	 * We are setting up two handlers:
    > > > +	 *
    > > > +	 * 1) locore_vc_trap32:  Triggered when we are running in
    > > > +	 *    32-bit legacy mode.
    > > > +	 *
    > > > +	 * 2) locore_vc_trap64:  Triggered when we are running in
    > > > +	 *    32-bit compatibility mode.
    > > > +	 *
    > > > +	 * The latter one is used by vmd(8).
    > > 
    > > Please clarify; *when* is this used by vmd? I believe you mean when
    > > we do a direct kernel launch? If not, then why do we need the
    > > 32-bit one?
    > 
    > There are two ways we may enter the kernel.  KVM/qemu uses a
    > special TianoCore EFI implementation.  For vmm/vmd we currently
    > support direct kernel boot only.  hshoexer@ explained to me why we
    > need both 32-bit methods here, but I forgot.
    
    With direct kernel launch vmd(8) sets up compatibility mode: this is
    basically long mode (64-bit) with 32-bit code executing in a 32-bit
    segment.  However, exceptions (and interrupts) are handled by long
    mode rules, so we need a 64-bit IDT entry, which is only allowed to
    reference a long mode code segment.  Hence we need a 64-bit handler.
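
    To illustrate: in long mode an IDT entry is a 16-byte gate whose
    selector must name a 64-bit code segment.  A minimal sketch of such
    a gate for vector 29 (#VC); the selector value and the label are
    illustrative, not the actual locore names:

        /* long-mode interrupt gate for vector 29 (#VC), 16 bytes */
                .align  16
        idt_vc64:
                .word   0       /* handler offset 15:0, patched at runtime */
                .word   0x08    /* selector: a 64-bit (L=1) code segment */
                .byte   0       /* IST index 0 */
                .byte   0x8e    /* present, DPL 0, interrupt gate */
                .word   0       /* handler offset 31:16 */
                .long   0       /* handler offset 63:32 */
                .long   0       /* reserved */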
    
    On Linux/KVM we seem to actually run in legacy mode, so there we
    need a 32-bit handler.
    
    I think in the long run we want to use EFI boot on vmd/vmm anyway.
    So we might be able to simplify that code in the future.
    
    > > >  	/* XXX merge these */
    > > >  	call	init_x86_64
    > > >  	call	main
    > > >
    > > > +	/* MSR Protocol Request Codes */
    > > > +#define MSRPROTO_CPUID_REQ	0x4
    > > > +#define MSRPROTO_TERM_REQ	0x100
    > > > +
    > > > +vc_cpuid64:
    > > > +	shll	$30, %eax		/* requested register */
    > > > +	orl	$MSRPROTO_CPUID_REQ, %eax
    > > > +	movl	%ebx, %edx		/* CPUID function */
    > > > +	movl	$MSR_SEV_GHCB, %ecx
    > > > +	wrmsr
    > > > +	rep vmmcall
    > > 
    > > Out of curiosity, why is the rep prefix needed here?
    > 
    > I don't know.  hshoexer?
    
    vmmcall and vmgexit are basically the same instruction.  In APM
    vol. 2, vmmcall is defined as "0x0f 0x01 0xd9" and vmgexit as
    "0xf2/0xf3 0x0f 0x01 0xd9", with 0xf2 and 0xf3 being the repne and
    rep prefixes.  As we do not have the vmgexit mnemonic in LLVM yet,
    using "rep vmmcall" produces the correct byte sequence.  I think I
    saw this in Linux kernel code.
    
    Another option might be to use .byte.  And of course, adding the
    vmgexit mnemonic to LLVM.
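
    For illustration, the .byte variant with the 0xf3 (rep) prefix
    would be:

        /* vmgexit: the same bytes "rep vmmcall" assembles to */
        .byte   0xf3, 0x0f, 0x01, 0xd9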
    
    As far as I understand APM vol. 2, the only difference between
    vmmcall and vmgexit is that vmmcall raises #UD when the hypervisor
    has not configured the vmmcall intercept, in which case the guest
    does not exit.  vmgexit always exits the guest when SEV-ES is
    enabled.  When SEV-ES is not enabled -- or the CPU does not support
    SEV-ES -- vmgexit behaves like vmmcall.
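
    To complete the MSR protocol picture from the diff above: after the
    vmgexit the guest reads the hypervisor's answer back from the GHCB
    MSR.  A rough sketch of the readback; the response code constant and
    the termination label are made-up names, and per the GHCB spec the
    response code is the request code + 1, with the requested register
    value returned in the upper 32 bits:

        #define MSRPROTO_CPUID_RESP     0x5     /* MSRPROTO_CPUID_REQ + 1 */

                movl    $MSR_SEV_GHCB, %ecx
                rdmsr                           /* response in %edx:%eax */
                andl    $0xfff, %eax            /* low 12 bits: response code */
                cmpl    $MSRPROTO_CPUID_RESP, %eax
                jne     vc_terminate            /* bad response: MSRPROTO_TERM_REQ */
                /* %edx now holds the requested CPUID register value */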
    
    Take care,
    Hans-Joerg
    