improve spinning in mtx_enter
On Mon, Mar 25, 2024 at 09:56:24PM +0100, Alexander Bluhm wrote:
> On Sun, Mar 24, 2024 at 11:53:14AM +0100, Mark Kettenis wrote:
> > > > If you zoom into kernel and search for mtx_enter, you see that
> > > > spinning time goes down from 13.1% to 10.7%
> >
> > I assume that's an amd64 machine? Can we test this on some other
> > architectures too before we commit it? I don't expect a regression,
> > but it would be interesting to see what the effect is on an
> > architecture that has better atomic operations than amd64.
>
> On powerpc64 this diff hangs as soon as the machine goes to
> multiuser. READ_ONCE(mtx->mtx_owner) fixes it. The problem is
> that the volatile in the struct mutex definition is in the wrong
> place. The mtx_owner variable itself must be volatile, not the
> object it points to.
The volatile change makes sense to me; the placement of volatile here
indeed looks like an oversight.
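For reference, the two placements bind the qualifier to different
objects (a minimal illustration, not taken from the tree):

	/* pointer to volatile data: nothing forces the compiler to
	 * re-read the pointer itself inside a loop */
	volatile void *mtx_owner;

	/* volatile pointer: every access of mtx_owner is a real load */
	void *volatile mtx_owner;

Only the second form guarantees that a spin loop re-reads mtx_owner.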
I suspect the hang on powerpc64 is due to CPU_BUSY_CYCLE being an
empty statement there:
#define CPU_BUSY_CYCLE() do {} while (0)
Shouldn't powerpc64 and other affected archs (mips64, powerpc, maybe
m88k?) switch to:
#define CPU_BUSY_CYCLE() __asm volatile("" ::: "memory")
in the MULTIPROCESSOR case? Looks like powerpc64 might also be able
to use the "wait" instruction here, but I can't test that (+cc
gkoehler).
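To spell out my suspicion: with the diff's inner loop, no volatile on
the pointer and an empty CPU_BUSY_CYCLE(), the compiler is allowed to
reduce the spin to something like this (a hypothetical simplification,
not actual generated code):

	/* the load of mtx_owner is hoisted out of the loop... */
	void *owner = mtx->mtx_owner;
	do {
		/* ...and nothing in an empty CPU_BUSY_CYCLE() forces a
		 * re-read, so this never terminates while owner != NULL */
	} while (owner != NULL);

Either the volatile placement fix, READ_ONCE(), or a "memory" clobber
in CPU_BUSY_CYCLE() forces the load to happen on every iteration.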
> time make -j4 bsd without diff:
> 4m33.70s real 10m17.07s user 6m37.56s system
>
> time make -j4 bsd with diff below:
> 4m32.17s real 10m12.17s user 6m39.56s system
>
> Measurements go up and down a bit, so it is more or less the
> same.
I managed to test this on riscv64 (Hifive Unmatched) and I get no
visible improvement there.
> > One potential issue with this diff is that the current code checks
> > whether we've panicked on every iteration of the loop. But with this
> > diff that doesn't happen. So a CPU that is already spinning will no
> > longer be set free when the kernel panics. I don't think that's a
> > problem as long as we don't have IPL_IPI mutexes. It might actually
> > be a good thing. But it is a change in behaviour to watch out for.
>
> I think you refer to this check in mtx_enter_try():
> if (panicstr || db_active)
> return (1);
>
> It has not been added to interrupt mtx_enter() when another CPU
> panics. After a panic "ddb> boot reboot" should not grab any lock.
> This still works.
>
> New diff with correct volatile in struct mutex.
> ok?
ok jca@ for the volatile change whenever you want to commit it (small
nitpicking below).
ok jca@ for the mtx_enter() diff once folks feel like it's been tested
on a large enough set of architectures.
> bluhm
>
> Index: kern/kern_lock.c
> ===================================================================
> RCS file: /data/mirror/openbsd/cvs/src/sys/kern/kern_lock.c,v
> diff -u -p -r1.72 kern_lock.c
> --- kern/kern_lock.c 26 Apr 2022 15:31:14 -0000 1.72
> +++ kern/kern_lock.c 25 Mar 2024 20:35:05 -0000
> @@ -264,15 +264,17 @@ mtx_enter(struct mutex *mtx)
>
> spc->spc_spinning++;
> while (mtx_enter_try(mtx) == 0) {
> - CPU_BUSY_CYCLE();
> -
> + do {
> + CPU_BUSY_CYCLE();
> #ifdef MP_LOCKDEBUG
> - if (--nticks == 0) {
> - db_printf("%s: %p lock spun out\n", __func__, mtx);
> - db_enter();
> - nticks = __mp_lock_spinout;
> - }
> + if (--nticks == 0) {
> + db_printf("%s: %p lock spun out\n",
> + __func__, mtx);
> + db_enter();
> + nticks = __mp_lock_spinout;
> + }
> #endif
> + } while (mtx->mtx_owner != NULL);
> }
> spc->spc_spinning--;
> }
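For the archives, if I read the intent right this is the classic
test-and-test-and-set pattern: spin on plain reads while the lock is
held and only go back to the atomic swap in mtx_enter_try() once the
owner field looks free, so the cache line stays shared instead of
bouncing between CPUs on every failed attempt. A rough sketch of the
pattern (not the actual code; ci stands for the curcpu() cpu_info
pointer, as in mtx_enter_try()):

	/* read-only spin while held, atomic retry only when it looks
	 * free; failed atomics on every iteration would keep pulling
	 * the cache line in exclusive state */
	while (atomic_cas_ptr(&mtx->mtx_owner, NULL, ci) != NULL) {
		while (mtx->mtx_owner != NULL)
			CPU_BUSY_CYCLE();
	}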
> Index: sys/mutex.h
> ===================================================================
> RCS file: /data/mirror/openbsd/cvs/src/sys/sys/mutex.h,v
> diff -u -p -r1.20 mutex.h
> --- sys/mutex.h 3 Feb 2024 22:50:09 -0000 1.20
> +++ sys/mutex.h 25 Mar 2024 20:35:25 -0000
> @@ -40,7 +40,7 @@
> #include <sys/_lock.h>
>
> struct mutex {
> - volatile void *mtx_owner;
> + void *volatile mtx_owner;
I would add a space before volatile for clarity.
> int mtx_wantipl;
> int mtx_oldipl;
> #ifdef WITNESS
>
--
jca