Mailing List Archive

On 10/02/25(Mon) 13:02, Claudio Jeker wrote: > On Mon, Feb 10, 2025 at 12:32:52PM +0100, Martin Pieuchot wrote: > > On 07/02/25(Fri) 17:47, Claudio Jeker wrote: > > > On Thu, Feb 06, 2025 at 06:46:52PM +0100, Martin Pieuchot wrote: > > > > On 06/02/25(Thu) 09:43, Theo de Raadt wrote: > > > > > Mark Kettenis <mark.kettenis@xs4all.nl> wrote: > > > > > > > > > > > > From: "Theo de Raadt" <deraadt@openbsd.org> > > > > > > > Date: Thu, 06 Feb 2025 09:17:52 -0700 > > > > > > > > > > > > > > > [2] in another thread, pledge("stdio rpath wpath"), and returns. > > > > > > > > the process is now pledged. > > > > > > > > > > > > > > How can this be allowed? > > > > > > > > > > > > > > I am pretty sure sys_pledge should single-thread the process, which > > > > > > > means it will wait until other threads complete their in-kernel sleeps. > > > > > > > > > > > > I'm not sure clauio@ will agree with you ;) > > > > > > > > > > He just agreed with me privately. > > > > > > > > I'd rather see a rwlock be used to serialized access to the per-process > > > > data structures. I don't see any reason to use the single thread API > > > > for this and I'd rather not spread its usage. It is already a pain to > > > > work with. > > > > > > We need to fix that and this is what I'm slowly working at. The single > > > thread API should be simple and allow a few syscalls to know that no other > > > thread will disturb them. exec and exit are such cases but I think unveil > > > and pledge fall into a similar category. They alter the behaviour of the > > > process and should therefor change the sate in a safe fashion. > > > > I disagree with this approach. > > I desagree with you on this one. The single thread API is one of the tools > we have to make MP/MT code safe. In this case this tool fits well and solves > an immediate problem. Hammering an rwlock around all of unveil is not > practicable on the other hand. The single thread API is IMHO not yet a tool. It is an extremely fragile subsystem which relies on a complex mutex & reference counting of threads plus internal state inspection that rely on the SCHED_LOCK() and that we do not want to spread. From performance point of view it is the worst and from far. It also interacts with signals and states that are (ab)used by ptrace. Plus it is a subsystem that is under heavy development with multiple bugs and locking dependencies that we haven't solved yet. On top of that it is very complicated and barely understood by very few over-busy kernel developers. Finally it has no integration in WITNESS nor way to assert that data access are properly "protected". > Hammering an rwlock around all of unveil is not > practicable on the other hand. This is completely false. unveil is a relatively small subsystems composed of 3 fields shared amount sibling threads in a process. It is not only practicable, it is doable starting with the write path unveil_add(). Plus the subsystem already use rwlock... I also believe this is a great place to start building knowledge about the limits of the VFS in term of data owners. I also believe the complexity of unveil is small enough that a kernel developer not experienced in locking can tackle it with some help. > Why can't we use all tools in the box? Why can we only use rwlocks and > mutexes for our solutions? Theo can do what he wants, it's his project. I'm just sharing my experience of 12 years working on parallelizing OpenBSD and what I believe is better for the project, its developers and users. > > > Since both pledge and unveil are almost never called it is better to > > > optimise for the readers (which happen on all or many syscalls). > > > > Indeed, a read/write lock is fine. > > No it is not. rwlocks are not a golden ticket that makes all pain go away. > They are massive bottle necks even now that the most obvious problem has > been fixed. They are no golden tickets, they are rwlocks. We know how to use them and clearly document which fields are protected by them. This allows us to work as a team re-using the knowledge of a previous developer and push the line of the lock a bit further. They are well integrated in WITNESS as well as in the %spin analysis tools. We know how to split contended lock or use alternate locking primitives when necessary.

2025-02-10 11:32 Martin Pieuchot:
patch: relax ni_pledge panic
- 2025-02-10 12:02 Claudio Jeker:
  patch: relax ni_pledge panic
- - 2025-02-17 14:40 Martin Pieuchot:
    patch: relax ni_pledge panic
  - - 2025-02-18 13:00 Claudio Jeker:
      patch: relax ni_pledge panic
    - - 2025-02-18 14:16 Mark Kettenis:
        patch: relax ni_pledge panic