From: Mark Kettenis <mark.kettenis@xs4all.nl>
Subject: Re: amd64/fpu: Avoid multiple FPU resets
To: Philip Guenther <guenther@gmail.com>
Cc: cludwig@genua.de, tech@openbsd.org
Date: Fri, 13 Jun 2025 23:46:06 +0200

> From: Philip Guenther <guenther@gmail.com>
> Date: Fri, 13 Jun 2025 11:37:03 -0700
> 
> On Fri, Jun 13, 2025 at 5:32 AM Christian Ludwig <cludwig@genua.de> wrote:
> > All kernel crypto code follows the scheme:
> >
> >         for (objects) {
> >                 fpu_kernel_enter();
> >                 ...
> >                 fpu_kernel_exit();
> >         }
> >
> > In every iteration, fpu_kernel_exit() resets the FPU state and
> > fpu_kernel_enter() resets it, again. FPU resets are expensive on some
> > platforms. Doing the operation twice per loop iteration is clearly not
> > necessary.
> >
> > The FPU is always in one of two states when we reach fpu_kernel_enter().
> > It either holds user state when CPUPF_USERXSTATE is set or it is in
> > reset state already. The context switching code and signal code follow
> > this assumption, too. So we can simply drop resetting the FPU in
> > fpu_kernel_enter() when it does not hold user state.
> 
> Hmm, yes, all the places that clear CPUF_USERXSTATE reset the state.
> 
> Does fpu_kernel_enter() get used from interrupt context?  Do we have
> to worry about an interrupt occurring between the clearing of the flag
> and the resetting of the state?

I'm pretty sure the intention was that you can onlu use
fpu_kernel_enter() from process context.  But I don't think we
documented this anywhere and we don't enforce this.  Should we stick
an assertwaitok() in there?