From: Mark Kettenis <mark.kettenis@xs4all.nl>
Subject: Re: amd64 pmap & resident_count
To: Martin Pieuchot <mpi@openbsd.org>
Cc: tech@openbsd.org
Date: Tue, 27 Aug 2024 22:43:51 +0200

> Date: Fri, 23 Aug 2024 16:19:37 +0200
> From: Martin Pieuchot <mpi@openbsd.org>
> 
> On a 24 CPU amd64 machine, building a kernel with -j24 always results in
> a corrupted "RES" `pmap_kernel()->pm_stats.resident_count' view in top(1).
> 
> Modifications to this field are (mostly) serialized by the pmap's mutex.
> Sadly, pmap_enter() does not grab the mutex for the kernel's pmap.  The
> diff below changes that and fixes the issue for me.  I believe it is
> easier and cheaper than using atomics everywhere.
> 
> ok?

Not sure this is a good idea.  Not grabbing the pmap mutex for the
kernel pmap was somewhat by design; I think at some point I convinced
myself that grabbing it could lead to deadlocks.
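
Roughly the scenario I had in mind, as a hypothetical call chain (not
checked against the current code): pmap_enter() on the kernel pmap may
need a fresh page table page while holding the mutex, and allocating
that page may itself require entering a new kernel mapping:

    pmap_enter(pmap_kernel(), va, pa, ...)
        mtx_enter(&pmap->pm_mtx)
        /* no PTP for va; allocate one (hypothetical path) */
        uvm_km_alloc(...)
            pmap_enter(pmap_kernel(), ...)      /* map the new page */
                mtx_enter(&pmap->pm_mtx)        /* same mutex: deadlock */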

So yes, the statistics for the kernel pmap may end up corrupted.  That
shouldn't be a problem though, since these statistics are pretty much
meaningless for the kernel pmap.  I don't even think our various
architectures do the accounting in a consistent way.
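
The "corruption" itself is nothing mysterious; it is the classic
lost-update race on an unlocked read-modify-write.  A standalone
userland sketch (nothing pmap-specific) that drops counts the same way:

    /* cc -pthread race.c && ./race */
    #include <pthread.h>
    #include <stdio.h>

    #define NTHREADS 4
    #define NITERS   1000000

    static long resident_count;    /* stand-in for pm_stats.resident_count */

    static void *
    enter_pages(void *arg)
    {
            int i;

            for (i = 0; i < NITERS; i++)
                    resident_count++;    /* unserialized read-modify-write */
            return (NULL);
    }

    int
    main(void)
    {
            pthread_t t[NTHREADS];
            int i;

            for (i = 0; i < NTHREADS; i++)
                    pthread_create(&t[i], NULL, enter_pages, NULL);
            for (i = 0; i < NTHREADS; i++)
                    pthread_join(t[i], NULL);
            /* expect 4000000; the unlocked counter usually prints less */
            printf("%ld\n", resident_count);
            return (0);
    }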

So maybe we should just not bother with these for the kernel pmap?
And rather than wrapping each of the pm_stats.resident_count and
pm_stats.wired_count updates with an

    if (pmap != pmap_kernel())

we should just change the few places where we actually use/report the
counts to report zero for the kernel pmap?
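
Something along these lines, i.e. keep the updates unlocked but have
the accessor report zero (a sketch only; assumes turning the
pmap_resident_count() macro into a function, with the same treatment
for the wired count):

    long
    pmap_resident_count(struct pmap *pmap)
    {
            /* Kernel pmap counts are updated without the mutex,
             * so don't pretend they mean anything. */
            if (pmap == pmap_kernel())
                    return (0);
            return (pmap->pm_stats.resident_count);
    }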

> 
> Index: arch/amd64/amd64/pmap.c
> ===================================================================
> RCS file: /cvs/src/sys/arch/amd64/amd64/pmap.c,v
> diff -u -p -r1.171 pmap.c
> --- arch/amd64/amd64/pmap.c	8 Aug 2024 15:57:22 -0000	1.171
> +++ arch/amd64/amd64/pmap.c	23 Aug 2024 14:13:46 -0000
> @@ -416,10 +416,6 @@ pmap_map_ptes(struct pmap *pmap)
>  
>  	KASSERT(pmap->pm_type != PMAP_TYPE_EPT);
>  
> -	/* the kernel's pmap is always accessible */
> -	if (pmap == pmap_kernel())
> -		return 0;
> -
>  	/*
>  	 * Lock the target map before switching to its page tables to
>  	 * guarantee other CPUs have finished changing the tables before
> @@ -427,6 +423,10 @@ pmap_map_ptes(struct pmap *pmap)
>  	 */
>  	mtx_enter(&pmap->pm_mtx);
>  
> +	/* the kernel's pmap is always accessible */
> +	if (pmap == pmap_kernel())
> +		return 0;
> +
>  	cr3 = rcr3();
>  	KASSERT((cr3 & CR3_PCID) == PCID_KERN ||
>  		(cr3 & CR3_PCID) == PCID_PROC);
> @@ -443,8 +443,7 @@ pmap_map_ptes(struct pmap *pmap)
>  void
>  pmap_unmap_ptes(struct pmap *pmap, paddr_t save_cr3)
>  {
> -	if (pmap != pmap_kernel())
> -		mtx_leave(&pmap->pm_mtx);
> +	mtx_leave(&pmap->pm_mtx);
>  
>  	if (save_cr3 != 0)
>  		lcr3(save_cr3);
> 
>