From: Peter Hessler
Subject: Re: Push UVM's pageqlock inside uvm_pagefree()
To: tech@openbsd.org
Date: Tue, 25 Feb 2025 12:38:08 +0100

On 2025 Feb 21 (Fri) at 10:55:42 +0100 (+0100), Martin Pieuchot wrote:
:On 20/02/25(Thu) 16:21, Martin Pieuchot wrote:
:> Hey Mark,
:> 
:> Thanks for the review.
:> 
:> On 19/02/25(Wed) 15:08, Mark Kettenis wrote:
:> > > Date: Wed, 19 Feb 2025 13:11:09 +0100
:> > > From: Martin Pieuchot
:> > > 
:> > > Here's the first step to reduce the contention on the global pageqlock
:> > > mutex.  The lock is necessary to guarantee the integrity of the global
:> > > LRUs.  So it is only required around uvm_pagedequeue().
:> > > 
:> > > This diff already makes a noticeable difference.  ok?
:> > 
:> > Possibly not ok.
:> > 
:> > Is it possible that uvm_pageclean() gets called simultaneously by two
:> > different CPUs for the same page?  I'm not 100% sure that isn't
:> > possible.
:> 
:> If the page is associated with an `anon' or `uobject' we already assert
:> that the so-called owner lock is held.  So it is not possible.
:> 
:> > But if you can convince me that that isn't possible, there
:> > are still a few issues:
:> > 
:> > 1. Probably should remove the
:> > 
:> >       => caller must lock page queues if `pg' is managed
:> > 
:> >    bit from the comment.
:> 
:> Indeed.
:> 
:> > 2. I think this means that uvmexp.wired-- now happens without holding
:> >    the uvmexp.pageqlock.  So that counter may go off and that in turn
:> >    may cause problems in the pagedaemon.  Maybe we need to use atomic
:> >    ops for it?
:> 
:> I agree, we do.
:> 
:> > 3. I think you're removing an optimization in uvn_flush().  That
:> >    function would accumulate cleaned pages and return them to
:> >    pmemrange in one go.  That potentially avoids lots of unlock/relock
:> >    cases.
:> 
:> This is now done via per-CPU magazines.  I committed a similar change to
:> uvm_anfree().  In my measurements it decreases contention on the fpageq
:> lock because we no longer replenish a magazine, or give pages back to
:> the pmemrange allocator, every time uvn_flush() is called.
:> 
:> > 4. The current code has some cases that do:
:> > 
:> >      uvm_lock_pageq();
:> >      pmap_page_protect(pg, PROT_NONE);
:> >      uvm_pagefree(pg);
:> >      uvm_unlock_pageq();
:> > 
:> >    I'm wondering if that is trying to stop the case where we remove
:> >    the page from the pmap, but another CPU tries to access the page
:> >    and fault it back in.  So what prevents that from happening once
:> >    you no longer grab the lock?
:> 
:> I like your question.
:> 
:> Note that currently the pageq lock doesn't prevent another CPU from
:> faulting between the PROT_NONE and uvm_pagefree().  When a fault occurs,
:> the VA is translated into an `anon' which might or might not be
:> associated with a physical page.
:> In the fault handler, accessing `anon->an_page' is done while holding
:> the corresponding anon lock, which is the lock of the amap it belongs to.
:> 
:> In other words, what prevents `pg' from being faulted back in is the
:> anon lock, which is held during uvm_pageclean().
:> 
:> The same logic applies to managed pages linked to a uobj, except that
:> the lock is `vmobjlock'.
:> 
:> Here's an updated diff with the comment removed and using atomic
:> operations for uvmexp.wired.
:
:I missed a chunk in the previous diff when rebasing.  Complete diff
:below.
:
:ok?
:

Tested on the arm64.p bulk builders, no issues seen.

-- 
One can't proceed from the informal to the formal by formal means.
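
As a rough illustration of the locking change under discussion (a
simplified sketch, not the committed diff; the real function does more
work than shown), the idea is that uvm_pageclean() only needs the
global page-queue mutex around the LRU removal itself, while the rest
of the teardown is covered by the owner lock:

    void
    uvm_pageclean(struct vm_page *pg)
    {
        /* The anon/uobject "owner" lock is asserted further up. */

        /*
         * Only the global LRU lists need the page-queue mutex, so
         * take it just around the dequeue instead of asking callers
         * to hold it across the whole free path.
         */
        uvm_lock_pageq();
        uvm_pagedequeue(pg);
        uvm_unlock_pageq();

        /* ...removal from the uobj/anon and flag cleanup elided... */
    }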
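
For point 2, a minimal sketch of what "atomic ops for uvmexp.wired"
could look like, assuming the <sys/atomic.h> helpers and the page's
wire_count field are used (the exact form in the updated diff may
differ, and a cast may be needed depending on how the counter is
declared):

    #include <sys/atomic.h>

    /* Previously uvmexp.wired-- was done under the page-queue lock. */
    if (pg->wire_count) {
        pg->wire_count = 0;
        atomic_dec_int(&uvmexp.wired);
    }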
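
And for point 4, a simplified before/after of the caller pattern Mark
quotes; this illustrates the argument above and is not an excerpt from
the tree:

    /* Before: free the page with the global page-queue lock held. */
    uvm_lock_pageq();
    pmap_page_protect(pg, PROT_NONE);
    uvm_pagefree(pg);
    uvm_unlock_pageq();

    /*
     * After: the caller already holds the anon lock or `vmobjlock',
     * which the fault handler also takes before touching
     * anon->an_page (or the uobj's pages), so the page cannot be
     * faulted back in here; the page-queue lock is only taken
     * internally, around the LRU dequeue.
     */
    pmap_page_protect(pg, PROT_NONE);
    uvm_pagefree(pg);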