
From: Martin Pieuchot <mpi@grenadille.net>
Subject: Re: anon pool cache
To: Mark Kettenis <mark.kettenis@xs4all.nl>
Cc: dlg@openbsd.org, tech@openbsd.org
Date: Wed, 16 Apr 2025 11:17:52 +0200

On 14/04/25(Mon) 23:25, Mark Kettenis wrote:
> > Date: Mon, 14 Apr 2025 09:48:09 +0200
> > From: Martin Pieuchot <mpi@grenadille.net>
> > 
> > On 10/03/25(Mon) 20:04, Martin Pieuchot wrote:
> > > The mutex of the anon pool is one of the most contended global locks in
> > > UVM.  Diff below makes it use a pool cache on MP kernels.  Without this,
> > > running the upper part of the fault handler in parallel doesn't improve
> > > anything.  This change already gives a small boost on its own.
> > > 
> > > Note that on LP64 "struct vm_anon" needs to grow a little bit...
> > > 
> > > ok?
> > 
> > This also helps single-threaded performance...  Anyone?
> 
> In principle ok kettenis@
> 
> > > Index: uvm/uvm_anon.c
> > > ===================================================================
> > > RCS file: /cvs/src/sys/uvm/uvm_anon.c,v
> > > diff -u -p -r1.62 uvm_anon.c
> > > --- uvm/uvm_anon.c	10 Mar 2025 14:13:58 -0000	1.62
> > > +++ uvm/uvm_anon.c	10 Mar 2025 18:45:51 -0000
> > > @@ -50,6 +50,14 @@ uvm_anon_init(void)
> > >  	pool_sethiwat(&uvm_anon_pool, uvmexp.free / 16);
> > >  }
> > >  
> > > +void
> > > +uvm_anon_init_percpu(void)
> > > +{
> > > +#ifdef MULTIPROCESSOR
> > > +	pool_cache_init(&uvm_anon_pool);
> > > +#endif
> > > +}
> > > +
> > >  /*
> > >   * uvm_analloc: allocate a new anon.
> > >   *
> > > Index: uvm/uvm_anon.h
> > > ===================================================================
> > > RCS file: /cvs/src/sys/uvm/uvm_anon.h,v
> > > diff -u -p -r1.22 uvm_anon.h
> > > --- uvm/uvm_anon.h	19 Jan 2021 13:21:36 -0000	1.22
> > > +++ uvm/uvm_anon.h	10 Mar 2025 18:46:25 -0000
> > > @@ -47,6 +47,10 @@ struct vm_anon {
> > >  	 * Drum swap slot # (if != 0) [if we hold an_page, PG_BUSY]
> > >  	 */
> > >  	int an_swslot;
> > > +
> > > +#if defined(MULTIPROCESSOR) && defined(__LP64__)
> > > +	long unused;			/* to match pool_cache_item's size */
> > > +#endif
> 
> It took me a while to realize what you're trying to say here.  But the
> per-CPU caching code requires the pool item to be at least the size of
> struct pool_cache_item.  So maybe spell it out like that?

I've used that wording in the comment.
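
For reference, spelled out like that, the padding comment in uvm_anon.h
could read roughly as follows (illustrative wording only, not necessarily
the exact text of the updated diff):

#if defined(MULTIPROCESSOR) && defined(__LP64__)
	/*
	 * The per-CPU cache layer stores its free-list bookkeeping
	 * inside free pool items, so an item must be at least as big
	 * as struct pool_cache_item.  Pad vm_anon up to that size.
	 */
	long unused;
#endif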

> That said, maybe we should add a PR_PERCPUCACHE flag to pool_init()
> and bump up the size as required?  And check that the flag is set in
> pool_cache_init() of course.

I don't mind.  I'll let dlg@ decide what is appropriate for this API.
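
As a rough illustration of that suggestion (flag name, value and placement
are hypothetical, nothing here is committed code), pool_init() could bump
the item size and pool_cache_init() could check for the flag:

#define PR_PERCPUCACHE	0x0800		/* hypothetical flag bit */

	/* in pool_init(), after the usual size/alignment fixups: */
	if (ISSET(flags, PR_PERCPUCACHE))
		size = MAX(size, sizeof(struct pool_cache_item));

	/* in pool_cache_init(): item must fit the per-CPU free-list header */
	KASSERT(ISSET(pp->pr_flags, PR_PERCPUCACHE));

Since struct pool_cache_item is presumably private to subr_pool.c, the size
bump would have to live there rather than in the pool's user.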