Index | Thread | Search

From:
Martin Pieuchot <mpi@grenadille.net>
Subject:
Re: patch: relax ni_pledge panic
To:
Theo de Raadt <deraadt@openbsd.org>, Mark Kettenis <mark.kettenis@xs4all.nl>, semarie@kapouay.eu.org, tech@openbsd.org
Date:
Mon, 17 Feb 2025 15:40:28 +0100

Download raw body.

Thread
On 10/02/25(Mon) 13:02, Claudio Jeker wrote:
> On Mon, Feb 10, 2025 at 12:32:52PM +0100, Martin Pieuchot wrote:
> > On 07/02/25(Fri) 17:47, Claudio Jeker wrote:
> > > On Thu, Feb 06, 2025 at 06:46:52PM +0100, Martin Pieuchot wrote:
> > > > On 06/02/25(Thu) 09:43, Theo de Raadt wrote:
> > > > > Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
> > > > > 
> > > > > > > From: "Theo de Raadt" <deraadt@openbsd.org>
> > > > > > > Date: Thu, 06 Feb 2025 09:17:52 -0700
> > > > > > > 
> > > > > > > > [2] in another thread, pledge("stdio rpath wpath"), and returns.
> > > > > > > >    the process is now pledged.
> > > > > > > 
> > > > > > > How can this be allowed?
> > > > > > > 
> > > > > > > I am pretty sure sys_pledge should single-thread the process, which
> > > > > > > means it will wait until other threads complete their in-kernel sleeps.
> > > > > > 
> > > > > > I'm not sure clauio@ will agree with you ;)
> > > > > 
> > > > > He just agreed with me privately.
> > > > 
> > > > I'd rather see a rwlock be used to serialized access to the per-process
> > > > data structures.  I don't see any reason to use the single thread API
> > > > for this and I'd rather not spread its usage.  It is already a pain to
> > > > work with.
> > > 
> > > We need to fix that and this is what I'm slowly working at. The single
> > > thread API should be simple and allow a few syscalls to know that no other
> > > thread will disturb them. exec and exit are such cases but I think unveil
> > > and pledge fall into a similar category. They alter the behaviour of the
> > > process and should therefor change the sate in a safe fashion.
> > 
> > I disagree with this approach.
> 
> I desagree with you on this one. The single thread API is one of the tools
> we have to make MP/MT code safe. In this case this tool fits well and solves
> an immediate problem. Hammering an rwlock around all of unveil is not
> practicable on the other hand.

The single thread API is IMHO not yet a tool.  It is an extremely fragile
subsystem which relies on a complex mutex & reference counting of
threads plus internal state inspection that rely on the SCHED_LOCK() and
that we do not want to spread.

From performance point of view it is the worst and from far.

It also interacts with signals and states that are (ab)used by ptrace.  Plus
it is a subsystem that is under heavy development with multiple bugs and
locking dependencies that we haven't solved yet.  On top of that it is very
complicated and barely understood by very few over-busy kernel developers.

Finally it has no integration in WITNESS nor way to assert that data access
are properly "protected".

>                       Hammering an rwlock around all of unveil is not
> practicable on the other hand.

This is completely false.  unveil is a relatively small subsystems
composed of 3 fields shared amount sibling threads in a process.  It is
not only practicable, it is doable starting with the write path
unveil_add().

Plus the subsystem already use rwlock...

I also believe this is a great place to start building knowledge about
the limits of the VFS in term of data owners.

I also believe the complexity of unveil is small enough that a kernel
developer not experienced in locking can tackle it with some help.

> Why can't we use all tools in the box? Why can we only use rwlocks and
> mutexes for our solutions?

Theo can do what he wants, it's his project.  I'm just sharing my
experience of 12 years working on parallelizing OpenBSD and what I
believe is better for the project, its developers and users.

> > > Since both pledge and unveil are almost never called it is better to
> > > optimise for the readers (which happen on all or many syscalls).
> > 
> > Indeed, a read/write lock is fine.
> 
> No it is not. rwlocks are not a golden ticket that makes all pain go away.
> They are massive bottle necks even now that the most obvious problem has
> been fixed.

They are no golden tickets, they are rwlocks.  We know how to use them
and clearly document which fields are protected by them.  This allows us
to work as a team re-using the knowledge of a previous developer and
push the line of the lock a bit further.  They are well integrated in
WITNESS as well as in the %spin analysis tools.  We know how to split
contended lock or use alternate locking primitives when necessary.