From: Philip Guenther Subject: Re: amd64/fpu: Avoid multiple FPU resets To: Mark Kettenis Cc: deraadt@openbsd.org, cludwig@genua.de, tech@openbsd.org Date: Sun, 15 Jun 2025 09:07:07 -0700 On Sun, Jun 15, 2025 at 1:24 AM Mark Kettenis wrote: > > > From: Philip Guenther > > Date: Sat, 14 Jun 2025 21:27:20 -0700 > > Hi Philip, > > > On Fri, Jun 13, 2025 at 5:49 PM Theo de Raadt wrote: > > > > > > Theo de Raadt wrote: > > > > > > > Mark Kettenis wrote: > > > > > > > > > > From: Philip Guenther > > > > > > Date: Fri, 13 Jun 2025 11:37:03 -0700 > > > > > > > > > > > > On Fri, Jun 13, 2025 at 5:32 AM Christian Ludwig wrote: > > > > > > > All kernel crypto code follows the scheme: > > > > > > > > > > > > > > for (objects) { > > > > > > > fpu_kernel_enter(); > > > > > > > ... > > > > > > > fpu_kernel_exit(); > > > > > > > } > > > > > > > > > > > > > > In every iteration, fpu_kernel_exit() resets the FPU state and > > > > > > > fpu_kernel_enter() resets it, again. FPU resets are expensive on some > > > > > > > platforms. Doing the operation twice per loop iteration is clearly not > > > > > > > necessary. > > > > > > > > > > > > > > The FPU is always in one of two states when we reach fpu_kernel_enter(). > > > > > > > It either holds user state when CPUPF_USERXSTATE is set or it is in > > > > > > > reset state already. The context switching code and signal code follow > > > > > > > this assumption, too. So we can simply drop resetting the FPU in > > > > > > > fpu_kernel_enter() when it does not hold user state. > > > > > > > > > > > > Hmm, yes, all the places that clear CPUF_USERXSTATE reset the state. > > > > > > > > > > > > Does fpu_kernel_enter() get used from interrupt context? Do we have > > > > > > to worry about an interrupt occurring between the clearing of the flag > > > > > > and the resetting of the state? > > > > > > > > > > I'm pretty sure the intention was that you can onlu use > > > > > fpu_kernel_enter() from process context. But I don't think we > > > > > documented this anywhere and we don't enforce this. Should we stick > > > > > an assertwaitok() in there? > > > > > > > > My point is has this covered pu some bugs. > > > > > > My point is has this covered up some bugs? [that is a question] > > > > I suspect there's a very small possibility of an amdgpu interrupt > > occurring during critical sections in sendsig() or sys_sigreturn() > > resulting in the user thread's FPU space being reset unexpectedly, and > > a slightly larger possibility of such an interrupt occurring during > > any kernel process-context FPU usage (IPsec? Softraid?) corrupting > > that kernel computation. > > > > Anyone using IPsec or softraid on their amdgpu system running X? > > Experience random IPsec or softraid corruption? > > > > > > So, given two goals: (1) permit safe fpu use by one-level of > > interrupts (amdgpu), and (2) avoid unnecessary fpuresets (what started > > this thread), I thinking we need 4 states: > > 1) has user xstate (the current CPUF_USERXSTATE flag): CPU registers > > are master copy of the user thread's xstate > > 2) unknown xstate: FPU may be dirty or still have weird mode and > > exception bits set by userspace > > 3) has safe, but unused xstate: fpureset() has been used since last in userspace > > 4) has kernel-proc xstate: CPU register are master copy of a kernel > > process-context operation > > > > and then an additional bit to hold "already in an interrupt using the > > FPU: panic if a nested interrupt tries to use the FPU (no where to > > save the xstate!)", possibly in the form of a const char* pointer to > > the __FILE__ of the outer interrupt, so we can see who's interrupting > > whom. > > > > fpu_kernel_enter() would > > * assertwaitok() > > * if state: > > 1) fpusave, set state 2, fpureset(), set state 4 > > 2) fpureset(), set state 4 > > 3) set state 4 > > 4) panic: recursive process-content FPU usage > > > > fpu_kernel_exit() would > > * assert state 4 > > * set state 3 > > > > New fpu_kernel_enter_intr(file = __FILE__) would: > > * assert curcpu()->ci_intr_fpu == NULL > > * curcpu()->ci_intr_fpu = file > > * if state: > > 1) fpusave, fpureset(), set state 3 > > 2) fpureset(), set state 3 > > 3) do nothing > > 4) fpusave() > > > > New fpu_kernel_exit_intr() would: > > * assert curcpu()->ci_intr_fpu != NULL > > * if state: > > 1 or 2) panic > > 3) do nothing > > 4) xstor > > * curcpu()->ci_intr_fpu = NULL; > > > > > > And everywhere clearing USERXSTATE needs to fpusave() _first_, then > > clear it (that's the 1->2 transition), and only then possibly do > > fpureset() to get to state 3 if it needs it (sendsig(), for example) > > > > > > Does that make sense? Should I push around a full diff or is there a > > clearly wrong understanding of the kernel FPU usages (use from nested > > interrupts?) that mean this is not sufficient? > > That sounds like a lot of complexity. Handling both process and interrupt contexts would add some, of course. > The idea that the kernel should only be allowed to use the FPU in > process context isn't an arbitrary choice. The Linux kernel has the > same rule: > > https://docs.kernel.org/core-api/floating-point.html#runtime-api > > Therefore I don't expect the amdgpu code to be using the FPU from > interrupt context. That would be a bug. Okay, then I think the original diff is correct, though maybe we add the assertwaitok() to fpu_kernel_enter() so we remain confident of that. Since the kernel's FPU usage is relatively limited in registers it touches, there would still be a potential gain to have fpu_kernel_exit() only clear the FPU registers we use, if that's cheaper than fpureset(), perhaps VZEROALL? For paranoia we could have a bit to track whether the kernel used the FPU so a full reset could be done in paths like sendsig()/setexec() that want to guarantee a fully clean FPU state before setting CPUF_USERXSTATE. Philip Guenther