From: Mark Kettenis <mark.kettenis@xs4all.nl>
Subject: Re: Upper bound for mtx_enter() exp. backoff
To: Martin Pieuchot <mpi@grenadille.net>
Cc: tech@openbsd.org
Date: Sat, 07 Jun 2025 18:04:24 +0200

> Date: Fri, 6 Jun 2025 11:32:46 +0200
> From: Martin Pieuchot <mpi@grenadille.net>
> 
> Turns out 64 is not enough to completely prevent hangs on Ampere Altra
> with 80 CPUs.
> 
> Instead of using a magic number pick the number of CPUs online.  On the
> Altra with 80 CPUs this makes the upper bound at 128.  Using the number
> of available CPUs should also reduce latency on smaller SMP machines.
> 
> With this I can purposely generate contention on the Ampere machine.
> 
> ok?

This makes some sense.  tedu@ already pointed out the "ncpusfound"
vs. "ncpusonline" issue.  We don't have an ncpusonline variable that
can be used here.  And "ncpusfound" might actually represent the
complexity of the hardware better; CPUs parked out of the scheduler
may still participate in the coherence fabric and impact forward
progress.

ok kettenis@
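
For readers wondering why 80 CPUs give an upper bound of 128 rather
than 80, here is a minimal userland sketch.  It assumes the spin count
starts at the minimum and is doubled whenever it is still below the
bound, which is consistent with the 128 figure quoted above but is not
copied from kern_lock.c.  The names backoff_next(), backoff_ceiling()
and BACKOFF_MIN_CYCLES are hypothetical stand-ins, not kernel
identifiers.

```c
#include <assert.h>

/* Hypothetical stand-in for CPU_MIN_BUSY_CYCLES. */
#define BACKOFF_MIN_CYCLES	1u

/*
 * One backoff step (assumed behaviour): double the spin count while
 * it is still below the bound, otherwise leave it unchanged.
 */
static unsigned int
backoff_next(unsigned int ncycle, unsigned int max_cycles)
{
	assert(ncycle >= BACKOFF_MIN_CYCLES);
	if (ncycle < max_cycles)
		ncycle += ncycle;
	return ncycle;
}

/*
 * Largest spin count the doubling can reach for a given bound.
 * Because the check happens before the doubling, the result is the
 * smallest power of two that is >= max_cycles (for small bounds;
 * overflow of very large bounds is not handled in this sketch).
 */
static unsigned int
backoff_ceiling(unsigned int max_cycles)
{
	unsigned int ncycle = BACKOFF_MIN_CYCLES, next;

	while ((next = backoff_next(ncycle, max_cycles)) != ncycle)
		ncycle = next;
	return ncycle;
}
```

Under these assumptions backoff_ceiling(80) is 128 and
backoff_ceiling(64) stays at 64, so a bound that is not a power of two
is effectively rounded up to the next one.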

> Index: kern/kern_lock.c
> ===================================================================
> RCS file: /cvs/src/sys/kern/kern_lock.c,v
> diff -u -p -r1.78 kern_lock.c
> --- kern/kern_lock.c	31 May 2025 10:24:50 -0000	1.78
> +++ kern/kern_lock.c	3 Jun 2025 08:25:31 -0000
> @@ -37,6 +37,8 @@
>  int __mp_lock_spinout = INT_MAX;
>  #endif /* MP_LOCKDEBUG */
>  
> +extern int ncpusfound;
> +
>  /*
>   * Min & max numbers of "busy cycles" to waste before trying again to
>   * acquire a contended lock using an atomic operation.
> @@ -50,7 +52,7 @@ int __mp_lock_spinout = INT_MAX;
>   * enough to reduce (ideally avoid) cache line contention.
>   */
>  #define CPU_MIN_BUSY_CYCLES	1
> -#define CPU_MAX_BUSY_CYCLES	64
> +#define CPU_MAX_BUSY_CYCLES	ncpusfound
>  
>  #ifdef MULTIPROCESSOR
>  