[PATCH] convert mpl ticket lock to anderson's lock
On 17/02/26(Tue) 13:27, Theo de Raadt wrote:
> Claudio Jeker <cjeker@diehard.n-r-g.com> wrote:
>
> > On Tue, Feb 17, 2026 at 12:25:16PM -0700, Theo de Raadt wrote:
> > > Theo de Raadt <deraadt@openbsd.org> wrote:
> > >
> > > > > Per that post, the primary problem concerns page allocation and the
> > > > > way mutexes are implemented.
> > > >
> > > > While looking at things in the pagedaemon, I've become concerned
> > > > about how uvm_lock_pageq() is used.
> > > >
> > > > Look at how uvm_pageactivate() and uvm_pagedeactivate() must be called
> > > > unlocked.
> > > >
> > > > So locked-callers unlock temporarily to call them, resulting in
> > > > very strange unlock/lock/tiny-operation/unlock/lock sequences.
> > > >
> > > > Without the locks the tiny-operation would be instantaneous, but this
> > > > design means there is a tremendous bubble if someone else wanted the lock.
> > > >
> > > > I think these low-level changes in locking and mutexes will have little
> > > > effect if we are using them extremely poorly, which we are.
> > >
> > > This is why.
> > >
> > > revision 1.184
> > > date: 2025/12/10 08:38:18; author: mpi; state: Exp; lines: +44 -37; commitid: ACX7PGcPcpfRHo20;
> > > Push `pageqlock' dances inside uvm_page{de,}activate() & uvm_pagewire().
> > >
> > > Tested during multiple bulks on amd64, i386, arm64 and sparc64 by jca@,
> > > phessler@ and sthen@.
> > >
> > >
> > >
> > > This pushing of the lock into the lower-level functions in turn requires
> > > that many higher-level callers have to unlock in order to call a function
> > > which now locks internally, and upon return, relock for their own needs.
> > >
> > > I don't think that reduces contention; it only hides it. But it
> > > definitely increases latency for the specific caller, because someone
> > > else wins the mutex during the two release windows. Parallelism may have
> > > increased, but the latency of critical functions calling these will
> > > suffer. Some of these critical functions, if they didn't do this dance,
> > > would be finished quicker and the system would be less contended.
> > >
> > > To me this looks like deck chairs successfully rearranged.
> > >
> >
> > I actually think this diff introduced some locking issues where objects
> > get modified in multiple steps instead of atomically. If two CPUs run in
> > lockstep, the result could be a corrupt object.
>
> I agree.
>
> uvm_lock_pageq() is explicitly defined to just lock the queue.
> However, in many cases the calling code also implicitly uses it to
> provide atomicity for some operations upon the page being handled,
> because all similar atomicity-requiring sequences also occur inside
> uvm_lock_pageq()-protected regions.
>
> But when this lock hold is divided into two or three lock/unlock
> sequences, it introduces the ability for a shared mapping on the same
> page or object in a different context to act upon the page in ways
> which were previously blocked. Those code sequences only occur inside
> uvm_lock_pageq()-protected regions, which is why the atomicity
> implicitly survives.
All contexts first try to grab the owner lock of a page, then check for
the BUSY flag. That is what prevents the fault handler, or any other part
of the VM subsystem, from messing with the page.
> This appears to be the design of the uvm layer, and without
> restructuring to avoid these bugs first, lock movement of this type is
> extremely dangerous.
I agree. My pending diffs are bug fixes. Deferring them would not
help us.