Index | Thread | Search

From:
"Theo de Raadt" <deraadt@openbsd.org>
Subject:
Re: SYS_pinsyscalls question
Cc:
Mateusz Guzik <mjguzik@gmail.com>, Marc Espie <marc.espie.openbsd@gmail.com>, tech@openbsd.org
Date:
Mon, 24 Feb 2025 08:31:02 -0700

Download raw body.

Thread
  • Theo de Raadt:

    SYS_pinsyscalls question

  • Mark Kettenis:

    SYS_pinsyscalls question

  • Theo de Raadt <deraadt@openbsd.org> wrote:
    
    > Mateusz Guzik <mjguzik@gmail.com> wrote:
    > 
    > > On Sun, Feb 23, 2025 at 5:21 PM Theo de Raadt <deraadt@openbsd.org> wrote:
    > > >
    > > > > Then this copyin:
    > > > >           error = copyin(SCARG(uap, pins), pins, npins * sizeof(u_int));
    > > > >           if (error)
    > > > >                   goto err;
    > > > >
    > > > > will go off cpu and drop the kernel lock, allowing other threads to
    > > > > reach the same point, all still observing pr->ps_libcpin.pn_start == 0.
    > > > >
    > > > > Then several threads will reach this assignment:
    > > > >           pr->ps_libcpin.pn_start = base;
    > > > >
    > > > > is going to override all previously set bufs, leaking them.
    > > >
    > > > Yes, would need very custom crt0 in those static binaries.
    > > >
    > > 
    > > Presumably someone wanting to mess with the kernel would go there.
    > > 
    > > > This diff should allow the first thread to do pinsyscalls to win, and
    > > > re-check the condition post-sleep so that others lose and free their
    > > > pintable.  EPERM matches the existing spec in the man page.
    > > >
    > > 
    > > Another angle is what happens when one thread is messing about with
    > > installing the table, while another is issuing syscalls. Note there is
    > > no synchronization between this and pin_check. It may be this does not
    > > blow up in practice at the moment, but I don't think depending on it
    > > is viable.
    > 
    > That is not a concern.  system calls which start before pinsyscalls
    > completes setup, work as expected.
    
    To be more clear, this strategy was recently taken with pledge:
    
        CVSROOT:        /cvs
        Module name:    src
        Changes by:     deraadt@cvs.openbsd.org 2025/02/10 09:45:46
        
        Modified files:
                sys/sys        : proc.h
                sys/kern       : kern_event.c kern_pledge.c
                sys/dev/vmm    : vmm.c
        
        Log message:
        A syzkaller report was diagnosed by semarie, and found a namei-related
        sleeping system call which was re-inspecting p->p_p->ps_pledge in one thread,
        after another thread had reduced the promises by calling pledge(), with
        promises which would have prevented that syscall from being called in
        the first place.  This inconsistant promise view is dangerous.  So let's
        change pledge semantics a tiny bit:  We copy the per-process p_p->ps_pledge
        value to per-thread p_pledge at invocation of each system call, so that the
        configuration is stable.
        This method avoids increasing the cost of pledge checks.
        ok claudio kettenis semarie
        
    Basically, sys_pledge() sets process flags very late.  A system call
    performed in a different process before that completion, is operating
    under the old rules, and allowed to enter the kernel and will then run
    to completion.
    
    It should be the same with pinsyscalls.  Any system call entry attempted
    by a different thread before pinsyscalls() completes, will succeed.
    
    That's fine.  There is no interlock needed, because we are not trying
    to solve time travel.
    
    
  • Theo de Raadt:

    SYS_pinsyscalls question

  • Mark Kettenis:

    SYS_pinsyscalls question