From: Claudio Jeker <cjeker@diehard.n-r-g.com>
Subject: Re: UVM amap & vmmap rbtree
To: tech@openbsd.org
Date: Thu, 24 Jul 2025 15:28:35 +0200

On Tue, Jul 22, 2025 at 04:29:18PM +0200, Martin Pieuchot wrote:
> The use of an rb-tree in the current vmmap implementation is responsible
> for the poor page fault performance we're all experiencing.
> 
> To reduce wasted memory in UVM, large anonymous segments are chunked
> into 64k ones.  For example, the main firefox process on this laptop,
> right after starting it, has the following:
> 
> $ doas procmap -p 9434|wc -l              
>     6849
> $ doas procmap -p 9434|grep " 64k"|wc -l
>     457
> 
> Chunking a segment via UVM_MAP_CLIP_START/END was O(1) before using an
> rbtree and now requires rebalancing the tree.  This is not only bad
> for single-threaded performance, it completely kills multi-threaded
> performance because this operation requires an exclusive lock.

Why do we clip entries? I thought this was only needed to create submaps
when things like mlock() or madvise() are called on a part of a larger
map.
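
For reference, a rough sketch of what the clipping discussed above boils
down to; the types and the clip_entry() helper are made up for
illustration, not the real uvm_map code:

#include <stdlib.h>

/*
 * Hypothetical simplified types, not the real uvm_map structures.
 * Entries are non-overlapping and kept sorted by start address.
 */
struct entry {
        unsigned long    start;         /* inclusive */
        unsigned long    end;           /* exclusive */
        struct entry    *next;
};

/*
 * Split the entry covering [start, end) into [start, addr) and
 * [addr, end).  Returns the new second half, or NULL if no split was
 * needed or allocation failed.  On a linked list this is an O(1)
 * splice; a tree-backed map would also have to insert the new entry
 * and rebalance, under an exclusive lock.
 */
struct entry *
clip_entry(struct entry *e, unsigned long addr)
{
        struct entry *new;

        if (addr <= e->start || addr >= e->end)
                return NULL;            /* nothing to clip */

        if ((new = malloc(sizeof(*new))) == NULL)
                return NULL;

        new->start = addr;
        new->end = e->end;
        new->next = e->next;

        e->end = addr;
        e->next = new;

        return new;
}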
 
> The diff below allows multi-threaded processes to perform CoW in
> parallel.  Sadly it doesn't gain much because chunking a segment needs
> an exclusive lock.
> 
> Clearly segment chunking is not compatible with the use of an rbtree to
> store addresses.  Getting rid of chunking would mean following the
> Solaris way and allocating a page table for every segment, or following
> the Linux/FreeBSD way and going back to a tree of anonymous objects.
> Do you see another option?

I'm certain that there is a data structure that works well here. To me
this looks like something a btree could be good at.
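
As a rough illustration of why a btree-style structure could fit, this
is the kind of lookup each node would do: binary search over entries
sorted by start address, then a containment check.  The range struct
and range_lookup() here are made-up names, just a sketch:

#include <stddef.h>

/* Made-up range type; a btree node would hold a small sorted array. */
struct range {
        unsigned long start;    /* inclusive */
        unsigned long end;      /* exclusive */
};

/*
 * Binary search for the last entry with start <= addr, then check
 * that addr actually falls inside it.  O(log n) per node, no
 * rebalancing on lookup, and cache-friendly compared to chasing
 * rbtree pointers.
 */
const struct range *
range_lookup(const struct range *r, size_t n, unsigned long addr)
{
        size_t lo = 0, hi = n;

        while (lo < hi) {
                size_t mid = lo + (hi - lo) / 2;

                if (r[mid].start <= addr)
                        lo = mid + 1;
                else
                        hi = mid;
        }
        if (lo == 0 || addr >= r[lo - 1].end)
                return NULL;
        return &r[lo - 1];
}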
 
> On the other hand, why did we pick an rbtree for vmmap?  Is this really
> needed?  I doubt it makes sense to keep this data structure with memory
> spaces that keep getting bigger and bigger.
> 
> Comments?  Ideas?  Inputs?
 
From my understanding the rbtree was added because random mmap caused an
explosion of entries in the vmmap.  In the original UVM the vmmap is a
sorted linear list, and to improve lookup speed a single cache entry was
used.  The idea was that you fault around the same address again.  This
only works if the number of map entries is low.  Our processes can have
many thousands of map entries, so a linear search really does not cut it.
Also, the map hint cache is not effective with an aggressively randomized
address space and multi-threaded applications.
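
For illustration, that scheme looks roughly like this; the types and
map_lookup() are simplified stand-ins, not the actual UVM code:

#include <stddef.h>

/* Simplified stand-ins for the vmmap types described above. */
struct map_entry {
        unsigned long            start; /* inclusive */
        unsigned long            end;   /* exclusive */
        struct map_entry        *next;  /* list sorted by start address */
};

struct map {
        struct map_entry        *first;
        struct map_entry        *hint;  /* single-entry lookup cache */
};

/*
 * Find the entry containing addr.  The hint hits when faults cluster
 * around the previous lookup; otherwise the walk is O(n), which is
 * what breaks down with thousands of entries and randomized mmap.
 */
struct map_entry *
map_lookup(struct map *m, unsigned long addr)
{
        struct map_entry *e = m->hint;

        if (e != NULL && addr >= e->start && addr < e->end)
                return e;

        for (e = m->first; e != NULL && e->start <= addr; e = e->next) {
                if (addr < e->end) {
                        m->hint = e;
                        return e;
                }
        }
        return NULL;
}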

In general there are other data structures that could hold this
information.  In the end a vm_map_entry is all about the start and end
address of a range.

-- 
:wq Claudio