From: Martin Pieuchot Subject: Re: anon pool cache To: Mark Kettenis Cc: dlg@openbsd.org, tech@openbsd.org Date: Wed, 16 Apr 2025 11:17:52 +0200 On 14/04/25(Mon) 23:25, Mark Kettenis wrote: > > Date: Mon, 14 Apr 2025 09:48:09 +0200 > > From: Martin Pieuchot > > > > On 10/03/25(Mon) 20:04, Martin Pieuchot wrote: > > > The mutex of the anon pool is one of the most contended global locks in > > > UVM. Diff below makes it use a pool cache on MP kernels. Without this, > > > running the upper part of the fault handler in parallel doesn't improve > > > anything. This change already gives a small boost on its own. > > > > > > Note that on LP64 "struct vm_anon" needs to grow a little bit... > > > > > > ok? > > > > This also help single-threaded performances... Anyone? > > In principle ok kettenis@ > > > > Index: uvm/uvm_anon.c > > > =================================================================== > > > RCS file: /cvs/src/sys/uvm/uvm_anon.c,v > > > diff -u -p -r1.62 uvm_anon.c > > > --- uvm/uvm_anon.c 10 Mar 2025 14:13:58 -0000 1.62 > > > +++ uvm/uvm_anon.c 10 Mar 2025 18:45:51 -0000 > > > @@ -50,6 +50,14 @@ uvm_anon_init(void) > > > pool_sethiwat(&uvm_anon_pool, uvmexp.free / 16); > > > } > > > > > > +void > > > +uvm_anon_init_percpu(void) > > > +{ > > > +#ifdef MULTIPROCESSOR > > > + pool_cache_init(&uvm_anon_pool); > > > +#endif > > > +} > > > + > > > /* > > > * uvm_analloc: allocate a new anon. > > > * > > > Index: uvm/uvm_anon.h > > > =================================================================== > > > RCS file: /cvs/src/sys/uvm/uvm_anon.h,v > > > diff -u -p -r1.22 uvm_anon.h > > > --- uvm/uvm_anon.h 19 Jan 2021 13:21:36 -0000 1.22 > > > +++ uvm/uvm_anon.h 10 Mar 2025 18:46:25 -0000 > > > @@ -47,6 +47,10 @@ struct vm_anon { > > > * Drum swap slot # (if != 0) [if we hold an_page, PG_BUSY] > > > */ > > > int an_swslot; > > > + > > > +#if defined(MULTIPROCESSOR) && defined(__LP64__) > > > + long unused; /* to match pool_cache_item's size */ > > > +#endif > > It took me a while to realize what you're trying to say here. But the > per-CPU caching code requires to pool item to be at least the size of > struct pool_cache_item. SO maybe spell it out like that? I used that wording in comments. > That said, maybe we should add a PR_PERCPUCACHE flag to pool_init() > and bump up the size as required? And check that the flag is set in > pool_cache_init() of course. I don't mind. I let dlg@ decides what is the appropriate for this API.