Download raw body.
powerpc64: isync isn't a memory barrier
> Date: Wed, 12 Jun 2024 00:00:46 -0400
> From: George Koehler <kernigh@gmail.com>
>
> On powerpc64, I want to stop using "isync" as a memory barrier.
> This diff uses "sync". I explain below.
>
> Index: arch/powerpc64/include/atomic.h
> ===================================================================
> RCS file: /cvs/src/sys/arch/powerpc64/include/atomic.h,v
> diff -u -p -r1.3 atomic.h
> --- arch/powerpc64/include/atomic.h 29 Aug 2022 02:01:18 -0000 1.3
> +++ arch/powerpc64/include/atomic.h 10 Jun 2024 19:53:48 -0000
> @@ -276,10 +276,10 @@ _atomic_addic_long_nv(volatile unsigned
> #define __membar(_f) do { __asm volatile(_f ::: "memory"); } while (0)
>
> #if defined(MULTIPROCESSOR) || !defined(_KERNEL)
> -#define membar_enter() __membar("isync")
> +#define membar_enter() __membar("sync")
> #define membar_exit() __membar("sync")
> #define membar_producer() __membar("sync")
> -#define membar_consumer() __membar("isync")
> +#define membar_consumer() __membar("sync")
> #define membar_sync() __membar("sync")
> #else
> #define membar_enter() __membar("")
>
> "isync" is an instruction barrier. PowerPC Rev. 1 said,
>
> | Executing an *isync* instruction ensures that all instructions
> | preceding the *isync* instruction have completed before the *isync*
> | instruction completes, except that memory accesses caused by those
> | instructions need not have been performed with respect to other
> | processors and mechanisms.
>
> So "isync" is no barrier for memory accesses (loads or stores). As
> long as our membar_consumer(9) is "isync", it might fail to order the
> preceding loads before the following loads. This "isync" in
> /sys/kern/kern_sig.c cursig() looks wrong,
>
> ps_siglist = READ_ONCE(pr->ps_siglist);
> membar_consumer();
> mask = SIGPENDING(p);
>
> The intent might be to load ps_siglist before mask, but powerpc64
> membar_consumer does "isync", might load them in the wrong order.
>
> Mac OS on macppc had used "isync" as a load barrier. The (Mac G4)
> MPC7450 manual said,
>
> | The *isync* instruction is not executed until all previous
> | instructions complete to the point where they cannot cause an
> | exception. The *isync* instruction does not wait for all pending
> | stores in the store queue to complete.
>
> Because it mentions only stores, a reader might believe that *isync*
> does complete loads. Also, the (Mac G5) IBM970FX manual said,
>
> | In addition, the *isync* instruction is often used as a _load
> | barrier_ to prevent any subsequent load (or store) instructions
> | from executing before previous load instruction have been
> | completed.
>
> I suspect that "isync" is not a load barrier on newer powerpc64
> processors like POWER9. One can still use "isync" to order a
> conditional branch before a memory access (as Power ISA 3.0 B, Book
> II, B.2.1.1 uses "isync" after acquiring a lock). Most of our
> membar_enter calls and some of our membar_consume calls are after a
> conditional branch.
>
> My 4-core POWER9 observed no change with this diff. An 8-core POWER9
> in bugs@ "powerpc64/pmap.c trouble report" was running this diff, but
> I don't know whether the diff helped, or the kernel got lucky.
>
> This diff uses "sync" to order loads and stores. Later, I might
> propose a 2nd diff to start using "lwsync" on powerpc64. I wouldn't
> use "lwsync" on macppc.
Linux doesn't seem to use "isync" as a barrier, so this is probably a
good idea.
powerpc64: isync isn't a memory barrier