improve spinning in mtx_enter
> Date: Sun, 24 Mar 2024 11:09:12 +0100
> From: Martin Pieuchot <mpi@openbsd.org>
Alexander, your mail still (again?) fell into the spam trap, so I
didn't see it.
> On 21/03/24(Thu) 13:22, Alexander Bluhm wrote:
> > On Wed, Mar 20, 2024 at 01:43:18PM +0100, Mateusz Guzik wrote:
> > > few years back I noted that migration from MD mutex code to MI
> > > implementation happened to regress performance at least on amd64.
> > >
> > > The MI implementation fails to check whether the lock is free before
> > > issuing the CAS, and spins only once between attempts. This avoidably
> > > reduces performance.
> > >
> > > While more can be done here, at the bare minimum the code needs a
> > > NULL check, which I implemented below.
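For context, this is the classic test-and-test-and-set pattern: a
failed CAS still pulls the cache line in exclusive state, so spinning
on plain loads until the lock looks free keeps the line shared between
CPUs and only retries the CAS when it can plausibly succeed.  A
minimal userland sketch with C11 atomics, just to show the shape of
the idea (the names are made up, this is not the kernel code):

	#include <stdatomic.h>
	#include <stddef.h>

	struct spinlock {
		_Atomic(void *) owner;	/* NULL while the lock is free */
	};

	void
	spin_lock(struct spinlock *l, void *self)
	{
		void *expected;

		for (;;) {
			expected = NULL;
			/* Attempt the CAS only when the lock looks free. */
			if (atomic_compare_exchange_weak(&l->owner,
			    &expected, self))
				return;
			/*
			 * Spin on plain loads; no write traffic is
			 * generated while we wait.
			 */
			while (atomic_load_explicit(&l->owner,
			    memory_order_relaxed) != NULL)
				;	/* CPU_BUSY_CYCLE() in the kernel */
		}
	}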
> > >
> > > Results on 7.5 (taken from snapshots) timing make -ss -j 16 in the
> > > kernel dir:
> > >
> > > before: 521.37s user 524.69s system 1080% cpu 1:36.79 total
> > > after: 522.76s user 486.87s system 1088% cpu 1:32.79 total
> > >
> > > That is about a 4% reduction in total real time for a rather
> > > trivial change.
> >
> > I refactored the diff a bit to keep the while loop.
> >
> > With that I get a 5% reduction in sys time during a kernel build on
> > an 8 CPU machine. It has two sockets with four cores each.
> >
> > Flame graphs of kernel build are here:
> > http://bluhm.genua.de/perform/results/2024-03-19T23:41:35Z/2024-03-19T00%3A00%3A00Z/btrace/time_-lp_make_-CGENERIC.MP_-j8_-s-btrace-kstack.0.svg
> > http://bluhm.genua.de/perform/results/2024-03-19T23:41:35Z/patch-sys-mutex-spin.1/btrace/time_-lp_make_-CGENERIC.MP_-j8_-s-btrace-kstack.0.svg
> >
> > If you zoom into the kernel and search for mtx_enter, you see that
> > spinning time goes down from 13.1% to 10.7%.
I assume that's an amd64 machine? Can we test this on some other
architectures too before we commit it? I don't expect a regression,
but it would be interesting to see what the effect is on an
architecture that has better atomic operations than amd64.
> > Should we commit this?
One potential issue with this diff is that the current code checks
whether we've panicked on every iteration of the loop. But with this
diff that doesn't happen. So a CPU that is already spinning will no
longer be set free when the kernel panics. I don't think that's a
problem as long as we don't have IPL_IPI mutexes. It might actually
be a good thing. But it is a change in behaviour to watch out for.
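If we ever want to keep the old behaviour, the inner loop could
re-check the panic state, mirroring the panicstr/db_active test at
the top of mtx_enter_try().  An untested sketch of the idea:

	} while (mtx->mtx_owner != NULL &&
	    panicstr == NULL && !db_active);

The two extra loads per spin iteration would cut into the win of the
diff, though, so I'm not sure it is worth it.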
> ok mpi@
>
> > Index: kern/kern_lock.c
> > ===================================================================
> > RCS file: /data/mirror/openbsd/cvs/src/sys/kern/kern_lock.c,v
> > diff -u -p -r1.72 kern_lock.c
> > --- kern/kern_lock.c 26 Apr 2022 15:31:14 -0000 1.72
> > +++ kern/kern_lock.c 21 Mar 2024 12:11:20 -0000
> > @@ -264,15 +264,17 @@ mtx_enter(struct mutex *mtx)
> >
> > spc->spc_spinning++;
> > while (mtx_enter_try(mtx) == 0) {
> > - CPU_BUSY_CYCLE();
> > -
> > + do {
> > + CPU_BUSY_CYCLE();
> > #ifdef MP_LOCKDEBUG
> > - if (--nticks == 0) {
> > - db_printf("%s: %p lock spun out\n", __func__, mtx);
> > - db_enter();
> > - nticks = __mp_lock_spinout;
> > - }
> > + if (--nticks == 0) {
> > + db_printf("%s: %p lock spun out\n",
> > + __func__, mtx);
> > + db_enter();
> > + nticks = __mp_lock_spinout;
> > + }
> > #endif
> > + } while (mtx->mtx_owner != NULL);
> > }
> > spc->spc_spinning--;
> > }
> >
>
>