
From: Mark Kettenis <mark.kettenis@xs4all.nl>
Subject: Re: powerpc64: isync isn't a memory barrier
To: George Koehler <kernigh@gmail.com>
Cc: tech@openbsd.org
Date: Wed, 12 Jun 2024 22:20:53 +0200

> Date: Wed, 12 Jun 2024 00:00:46 -0400
> From: George Koehler <kernigh@gmail.com>
> 
> On powerpc64, I want to stop using "isync" as a memory barrier.
> This diff uses "sync".  I explain below.
> 
> Index: arch/powerpc64/include/atomic.h
> ===================================================================
> RCS file: /cvs/src/sys/arch/powerpc64/include/atomic.h,v
> diff -u -p -r1.3 atomic.h
> --- arch/powerpc64/include/atomic.h	29 Aug 2022 02:01:18 -0000	1.3
> +++ arch/powerpc64/include/atomic.h	10 Jun 2024 19:53:48 -0000
> @@ -276,10 +276,10 @@ _atomic_addic_long_nv(volatile unsigned 
>  #define __membar(_f) do { __asm volatile(_f ::: "memory"); } while (0)
>  
>  #if defined(MULTIPROCESSOR) || !defined(_KERNEL)
> -#define membar_enter()		__membar("isync")
> +#define membar_enter()		__membar("sync")
>  #define membar_exit()		__membar("sync")
>  #define membar_producer()	__membar("sync")
> -#define membar_consumer()	__membar("isync")
> +#define membar_consumer()	__membar("sync")
>  #define membar_sync()		__membar("sync")
>  #else
>  #define membar_enter()		__membar("")
> 
> "isync" is an instruction barrier.  PowerPC Rev. 1 said,
> 
> | Executing an *isync* instruction ensures that all instructions
> | preceding the *isync* instruction have completed before the *isync*
> | instruction completes, except that memory accesses caused by those
> | instructions need not have been performed with respect to other
> | processors and mechanisms.
> 
> So "isync" is no barrier for memory accesses (loads or stores).  As
> long as our membar_consumer(9) is "isync", it might fail to order the
> preceding loads before the following loads.  This "isync" in
> /sys/kern/kern_sig.c cursig() looks wrong,
> 
> 		ps_siglist = READ_ONCE(pr->ps_siglist);
> 		membar_consumer();
> 		mask = SIGPENDING(p);
> 
> The intent might be to load ps_siglist before mask, but because
> powerpc64's membar_consumer() does "isync", the two loads might
> happen in the wrong order.
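> 
> To illustrate the guarantee that membar_consumer(9) is supposed to
> provide, here is a generic producer/consumer sketch (hypothetical
> variables, not the actual kern_sig.c code):
> 
> 		/* producer */
> 		data = compute();
> 		membar_producer();	/* order data store before ready store */
> 		WRITE_ONCE(ready, 1);
> 
> 		/* consumer */
> 		v = READ_ONCE(ready);
> 		membar_consumer();	/* order ready load before data load */
> 		d = data;
> 
> With "isync" as membar_consumer(), and no conditional branch between
> the two loads, the load of data may be performed before the load of
> ready, so the consumer can observe ready == 1 together with stale
> data.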
> 
> Mac OS on macppc had used "isync" as a load barrier.  The (Mac G4)
> MPC7450 manual said,
> 
> | The *isync* instruction is not executed until all previous
> | instructions complete to the point where they cannot cause an
> | exception.  The *isync* instruction does not wait for all pending
> | stores in the store queue to complete.
> 
> Because it mentions only stores, a reader might believe that *isync*
> does wait for loads to complete.  Also, the (Mac G5) IBM970FX manual
> said,
> 
> | In addition, the *isync* instruction is often used as a _load
> | barrier_ to prevent any subsequent load (or store) instructions
> | from executing before previous load instruction have been
> | completed.
> 
> I suspect that "isync" is not a load barrier on newer powerpc64
> processors like POWER9.  One can still use "isync" to order a
> conditional branch before a memory access (as Power ISA 3.0 B, Book
> II, B.2.1.1 uses "isync" after acquiring a lock).  Most of our
> membar_enter calls and some of our membar_consumer calls are after a
> conditional branch.
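> 
> From memory, the lock-acquire sequence in that appendix looks roughly
> like this (r3 holds the lock address, r5 a nonzero "locked" value;
> details may differ from the exact figure in the ISA):
> 
> 	loop:	lwarx	r6,0,r3		# load lock word and reserve
> 		cmpwi	r6,0		# lock free?
> 		bne-	loop		# no, spin
> 		stwcx.	r5,0,r3		# try to store "locked"
> 		bne-	loop		# lost reservation, retry
> 		isync			# import barrier for the critical section
> 
> Here the branch depends on the result of the lwarx, and "isync"
> orders that branch before the memory accesses of the critical
> section, which is exactly the branch-then-isync pattern described
> above.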
> 
> I observed no change with this diff on my 4-core POWER9.  An 8-core
> POWER9 in the bugs@ "powerpc64/pmap.c trouble report" was running
> this diff, but I don't know whether the diff helped or the kernel
> just got lucky.
> 
> This diff uses "sync" to order loads and stores.  Later, I might
> propose a 2nd diff to start using "lwsync" on powerpc64.  I wouldn't
> use "lwsync" on macppc.

Linux doesn't seem to use "isync" as a barrier, so this is probably a
good idea.