Index | Thread | Search

From:
Martin Pieuchot <mpi@openbsd.org>
Subject:
Re: amd64 pmap & resident_count
To:
Mark Kettenis <mark.kettenis@xs4all.nl>
Cc:
tech@openbsd.org
Date:
Wed, 28 Aug 2024 09:57:39 +0200

Download raw body.

Thread
  • Mark Kettenis:

    amd64 pmap & resident_count

  • On 27/08/24(Tue) 22:43, Mark Kettenis wrote:
    > > Date: Fri, 23 Aug 2024 16:19:37 +0200
    > > From: Martin Pieuchot <mpi@openbsd.org>
    > > 
    > > On a 24 CPUs amd64 machine building a kernel with -j24 always result in
    > > a corrupted "RES" `pmap_kernel()->pm_stats.resident_count' view in top(1).
    > > 
    > > Modifications to this field are (mostly) serialized by the pmap's mutex.
    > > Sadly pmap_enter() do not grab the mutex for the kernel's pmap.  Diff
    > > below changes that and fix the issue for me.  I believe it is easier and
    > > cheaper than using atomics everywhere.
    > > 
    > > ok?
    > 
    > Not sure this is a good idea.  Not grabbing the pmap mutex for the
    > kernel pmap was somewhat by design; I think at some point I convinced
    > myself that doing so could lead to deadlocks.
    
    I'm not convinced.  The splvm() in pmap_growkernel() is not enough to
    protect from multiple callers.
    
    > So yes, the statistics for the kernel pmap may end up corrupted.  That
    > shouldn't be a problem though, since these statistics are pretty much
    > meaningless for the kernel pmap.  I don't even think our various
    > architectures do the accounting in a consistent way.
    
    I want to keep them.  We have too few counters to see what's happening
    in the kernel pmap.
    
    > So maybe we should just not bother with these for the kernel pmap?
    > And rather than wrapping each of the pm_stats.resident_count and
    > pm_stats.wiredcount with a an
    > 
    >     if (pmap != pmap_kernel())
    > 
    > we should just change the few places where we actually use/report the
    > counts to report zero for the kernel map?
    
    No, I want to fix them.  That's what the last diff in the thread does.
    
    > > Index: arch/amd64/amd64/pmap.c
    > > ===================================================================
    > > RCS file: /cvs/src/sys/arch/amd64/amd64/pmap.c,v
    > > diff -u -p -r1.171 pmap.c
    > > --- arch/amd64/amd64/pmap.c	8 Aug 2024 15:57:22 -0000	1.171
    > > +++ arch/amd64/amd64/pmap.c	23 Aug 2024 14:13:46 -0000
    > > @@ -416,10 +416,6 @@ pmap_map_ptes(struct pmap *pmap)
    > >  
    > >  	KASSERT(pmap->pm_type != PMAP_TYPE_EPT);
    > >  
    > > -	/* the kernel's pmap is always accessible */
    > > -	if (pmap == pmap_kernel())
    > > -		return 0;
    > > -
    > >  	/*
    > >  	 * Lock the target map before switching to its page tables to
    > >  	 * guarantee other CPUs have finished changing the tables before
    > > @@ -427,6 +423,10 @@ pmap_map_ptes(struct pmap *pmap)
    > >  	 */
    > >  	mtx_enter(&pmap->pm_mtx);
    > >  
    > > +	/* the kernel's pmap is always accessible */
    > > +	if (pmap == pmap_kernel())
    > > +		return 0;
    > > +
    > >  	cr3 = rcr3();
    > >  	KASSERT((cr3 & CR3_PCID) == PCID_KERN ||
    > >  		(cr3 & CR3_PCID) == PCID_PROC);
    > > @@ -443,8 +443,7 @@ pmap_map_ptes(struct pmap *pmap)
    > >  void
    > >  pmap_unmap_ptes(struct pmap *pmap, paddr_t save_cr3)
    > >  {
    > > -	if (pmap != pmap_kernel())
    > > -		mtx_leave(&pmap->pm_mtx);
    > > +	mtx_leave(&pmap->pm_mtx);
    > >  
    > >  	if (save_cr3 != 0)
    > >  		lcr3(save_cr3);
    > > 
    > > 
    
    
    
  • Mark Kettenis:

    amd64 pmap & resident_count