From: Greg Steuck
Subject: Re: arm64 -fret-clean attempt
To: "Theo de Raadt"
Cc: tech@cvs.openbsd.org
Date: Sat, 13 Jul 2024 11:08:57 +0200

Greg Steuck writes:

> In case somebody wants to stare at the differences without waiting for
> clang to build, behold
>
> --- good-direct.s	Sat Jul 13 00:30:39 2024
> +++ bad-direct.s	Sat Jul 13 00:29:50 2024
> @@ -40,7 +40,8 @@
>  	stp	xzr, x19, [x0]
>  	.cfi_def_cfa wsp, 32
>  	ldp	x19, x15, [sp, #16]	// 16-byte Folded Reload
> -	ldp	x29, x30, [sp], #32	// 16-byte Folded Reload
> +	ldp	x29, x30, [sp]		// 16-byte Folded Reload
> +	str	xzr, [sp, #32]!		// 8-byte Folded Spill

I believe the code is mangled here. Below is my interpretation of what
these instructions do, written as simpler single-value sequential
actions. I was using
https://devblogs.microsoft.com/oldnewthing/20220728-00/?p=106912
as a decoder ring:

-	ldp	x29, x30, [sp], #32	// 16-byte Folded Reload

	x29 <- [sp]
	x30 <- [sp + 8]
	sp <- sp + 32

+	ldp	x29, x30, [sp]		// 16-byte Folded Reload
+	str	xzr, [sp, #32]!		// 8-byte Folded Spill

	x29 <- [sp]
	x30 <- [sp + 8]
	[sp + 32] <- 0
	sp <- sp + 32

Since the point of the patch is to modify the code such that x30 is
overwritten on the stack, the code we really want would be the
following:

	x29 <- [sp]
	x30 <- [sp + 8]
	[sp + 8] <- 0
	sp <- sp + 32

Thanks
Greg
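
To double-check the three sequences above, here is a minimal C sketch that models the stack as an array of 8-byte slots. It is not taken from the patch or from clang. The struct and function names are made up for illustration, and the frame layout (saved x29 at [sp], saved x30 at [sp + 8], caller-owned memory starting at [sp + 32]) is an assumption read off the diff.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 4 slots for our 32-byte frame plus 2 slots assumed to belong to the caller. */
#define NSLOTS 6

/* Hypothetical machine model; all names here are illustrative only. */
struct machine {
	uint64_t stack[NSLOTS];	/* stack[i] holds the 8 bytes at offset 8*i */
	uint64_t sp;		/* byte offset into stack[] */
	uint64_t x29, x30;
};

/* Map a byte address to its 8-byte slot, checking alignment and bounds. */
static uint64_t *
slot(struct machine *m, uint64_t addr)
{
	assert(addr % 8 == 0 && addr / 8 < NSLOTS);
	return &m->stack[addr / 8];
}

/* Fill every slot with a recognizable nonzero value and reset sp. */
static void
setup(struct machine *m)
{
	memset(m, 0, sizeof(*m));
	for (int i = 0; i < NSLOTS; i++)
		m->stack[i] = 0x1000 + i;
	m->sp = 0;
}

/* ldp x29, x30, [sp], #32 */
static void
epilogue_good(struct machine *m)
{
	m->x29 = *slot(m, m->sp);
	m->x30 = *slot(m, m->sp + 8);
	m->sp += 32;
}

/* ldp x29, x30, [sp] ; str xzr, [sp, #32]! */
static void
epilogue_bad(struct machine *m)
{
	m->x29 = *slot(m, m->sp);
	m->x30 = *slot(m, m->sp + 8);
	/*
	 * Pre-index: the base is advanced by 32 and the store goes to the
	 * new address. Net effect matches "[sp + 32] <- 0; sp <- sp + 32".
	 */
	m->sp += 32;
	*slot(m, m->sp) = 0;
}

/* The sequence the mail describes as the one really wanted. */
static void
epilogue_wanted(struct machine *m)
{
	m->x29 = *slot(m, m->sp);
	m->x30 = *slot(m, m->sp + 8);
	*slot(m, m->sp + 8) = 0;	/* wipe the saved x30 */
	m->sp += 32;
}
```

In this model, after epilogue_bad() the saved x30 in stack[1] is still intact and the caller's slot stack[4] has been zeroed. After epilogue_wanted() it is the other way around: stack[1] is zero and stack[4] is untouched.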