From: Jeremie Courreges-Anglas
Subject: Re: improve spinning in mtx_enter
To: Alexander Bluhm
Cc: Mark Kettenis, Martin Pieuchot, mjguzik@gmail.com, tech@openbsd.org, gkoehler@openbsd.org
Date: Wed, 27 Mar 2024 19:50:44 +0100

On Mon, Mar 25, 2024 at 09:56:24PM +0100, Alexander Bluhm wrote:
> On Sun, Mar 24, 2024 at 11:53:14AM +0100, Mark Kettenis wrote:
> > > > If you zoom into kernel and search for mtx_enter, you see that
> > > > spinning time goes down from 13.1% to 10.7%
> >
> > I assume that's an amd64 machine?  Can we test this on some other
> > architectures too before we commit it?  I don't expect a regression,
> > but it would be interesting to see what the effect is on an
> > architecture that has better atomic operations than amd64.
>
> On powerpc64 this diff hangs as soon as the machine goes to
> multiuser.  READ_ONCE(mtx->mtx_owner) fixes it.  The problem is
> that the volatile in the struct mutex definition is in the wrong
> place.  The mtx_owner variable must be volatile, not the object
> it points to.

The volatile change makes sense to me; the placement of volatile here
indeed looks like an oversight.

I suspect the hang on powerpc64 is due to CPU_BUSY_CYCLE being an empty
statement there:

	#define CPU_BUSY_CYCLE()	do {} while (0)

Shouldn't powerpc64 and the other affected archs (mips64, powerpc,
maybe m88k?) switch to:

	#define CPU_BUSY_CYCLE()	__asm volatile("" ::: "memory")

in the MULTIPROCESSOR case?  Looks like powerpc64 might also be able
to use the "wait" instruction here, but I can't test that
(+cc gkoehler).

> time make -j4 bsd without diff:
>     4m33.70s real    10m17.07s user     6m37.56s system
>
> time make -j4 bsd with diff below:
>     4m32.17s real    10m12.17s user     6m39.56s system
>
> Measurement is going up and down a bit, so it is more or less the
> same.

I tested this on riscv64 (HiFive Unmatched) and see no visible
improvement there.
> > One potential issue with this diff is that the current code checks
> > whether we've panicked on every iteration of the loop.  But with this
> > diff that doesn't happen.  So a CPU that is already spinning will no
> > longer be set free when the kernel panics.  I don't think that's a
> > problem as long as we don't have IPL_IPI mutexes.  It might actually
> > be a good thing.  But it is a change in behaviour to watch out for.
>
> I think you refer to this check in mtx_enter_try():
>
>	if (panicstr || db_active)
>		return (1);
>
> It has not been added to interrupt mtx_enter() when another CPU
> panics.  After a panic "ddb> boot reboot" should not grab any lock.
> This still works.
>
> New diff with correct volatile in struct mutex.
> ok?

ok jca@ for the volatile change whenever you want to commit it (small
nitpicking below).

ok jca@ for the mtx_enter() diff once folks feel like it's been tested
on a large enough set of architectures.

> bluhm
>
> Index: kern/kern_lock.c
> ===================================================================
> RCS file: /data/mirror/openbsd/cvs/src/sys/kern/kern_lock.c,v
> diff -u -p -r1.72 kern_lock.c
> --- kern/kern_lock.c	26 Apr 2022 15:31:14 -0000	1.72
> +++ kern/kern_lock.c	25 Mar 2024 20:35:05 -0000
> @@ -264,15 +264,17 @@ mtx_enter(struct mutex *mtx)
>  
>  	spc->spc_spinning++;
>  	while (mtx_enter_try(mtx) == 0) {
> -		CPU_BUSY_CYCLE();
> -
> +		do {
> +			CPU_BUSY_CYCLE();
>  #ifdef MP_LOCKDEBUG
> -		if (--nticks == 0) {
> -			db_printf("%s: %p lock spun out\n", __func__, mtx);
> -			db_enter();
> -			nticks = __mp_lock_spinout;
> -		}
> +			if (--nticks == 0) {
> +				db_printf("%s: %p lock spun out\n",
> +				    __func__, mtx);
> +				db_enter();
> +				nticks = __mp_lock_spinout;
> +			}
>  #endif
> +		} while (mtx->mtx_owner != NULL);
>  	}
>  	spc->spc_spinning--;
>  }
> Index: sys/mutex.h
> ===================================================================
> RCS file: /data/mirror/openbsd/cvs/src/sys/sys/mutex.h,v
> diff -u -p -r1.20 mutex.h
> --- sys/mutex.h	3 Feb 2024 22:50:09 -0000	1.20
> +++ sys/mutex.h	25 Mar 2024 20:35:25 -0000
> @@ -40,7 +40,7 @@
>  #include <sys/_lock.h>
>  
>  struct mutex {
> -	volatile void *mtx_owner;
> +	void *volatile mtx_owner;

I would add a space before volatile for clarity.

>  	int mtx_wantipl;
>  	int mtx_oldipl;
>  #ifdef WITNESS

-- 
jca