Index | Thread | Search

From:
Martin Pieuchot <mpi@grenadille.net>
Subject:
Re: [EXT] Re: Kernel protection fault in fill_kproc()
To:
Mateusz Guzik <mjguzik@gmail.com>
Cc:
tech@openbsd.org
Date:
Sat, 13 Sep 2025 10:58:22 +0200

Download raw body.

Thread
  • Philip Guenther:

    Kernel protection fault in fill_kproc()

  • On 13/09/25(Sat) 01:59, Mateusz Guzik wrote:
    > On Thu, Aug 14, 2025 at 10:59:16AM +0000, Martin Pieuchot wrote:
    > 
    > To quote some other parts of the thread:
    > 
    > > > > In my opinion the fill_kproc() code is wrong and it should not look at
    > > > > pr->ps_vmspace if the PS_EXITING flag is set for the process.
    > > > > 
    > > > > exit1() sets PS_EXITING flag early on and after that point the vm can be
    > > > > purged so the vm_resident_count() is probably wrong anyway.
    > > 
    > > I guess that is safe since exit1() still runs with the kernel lock
    > > held.  Which means that the PS_EXITING flags can't be set while
    > > fill_kproc() runs, since it holds the kernel lock.
    > > 
    > 
    > 
    > I agree with the above sentiment, and with this in mind I disagree with
    > the patch.
    
    Let me rephrase:  you disagree with the existing logic.  The diff I sent
    follow this logic.
    
    > Trying to catch all the ways fill_kproc() can race against process
    > teardown is an avoidable exercise in misery and can be handled the same
    > way it was in sysctl_file(): utilize a refcount which stalls process
    > exit.
    
    Maybe.  Please show a diff, I'll be delighted to review it.  That said
    this is not trivial at all and my diff is a step in this direction.
    
    > I do have some criticism of the way it was implemented over there though
    
    Again, send a diff and maybe we might work together.  I'm well aware of
    all you wrote, so what?
    
    > On the exit side there is bunch of exiting-related flags being set,
    > including PS_EXITING and there is:
    > 	refcnt_finalize(&pr->ps_refcnt, "psdtor");
    > 
    > In sysctl_file():
    > 	KERNEL_LOCK();
    > 	[snip]
    > 	if (pr->ps_flags & (PS_SYSTEM | PS_EMBRYO | PS_EXITING))
    > 		continue;
    > 
    > 	refcnt_take(&pr->ps_refcnt);
    > 	....
    > 	refcnt_rele_wake(&pr->ps_refcnt);
    > 
    > I presume this works because proc exit also takes KERNEL_LOCK, but this
    > is bound to cause trouble at some point as the lock is pushed down.
    
    Then push it down and you might fix the issues.
    
    > Instead this relationship should be expicitly spelled out in relevant
    > helpers, so that if someone does work in the area and forgets to cover
    > it with new locking, they get a nice splat.
    
    Then go ahead and spell it out.
    
    > pseudo-code wise, something like this:
    
    That wouldn't work for other reasons.  Please use our time wisely and
    start by writing, testing and explaining a diff.
    
    > /*
    >  * One day this will be a per-process lock instead of kernel lock
    >  */
    > void
    > proc_drain_holders(struct proc *pr)
    > {
    > 	ASSERT_KERNEL_LOCKED();
    > 	KASSERT(pr->ps_flags & PS_EXITING);
    > 	refcnt_finalize(&pr->ps_refcnt, "psdtor");
    > }
    > 
    > bool
    > proc_tryhold(struct proc *pr)
    > {
    > 	ASSERT_KERNEL_LOCKED();
    > 	if (pr->ps_flags & PS_EXITING)
    > 		return false;
    > 	refcnt_take(&pr->ps_refcnt);
    > 	return true;
    > }
    > 
    > bool
    > proc_rele(struct proc *pr)
    > {
    > 	ASSERT_KERNEL_LOCKED();
    > 	refcnt_rele_wake(&pr->ps_refcnt);
    > }
    > 
    > Then sysctl_file() and fill_kproc() can do this, KERNEL_LOCK'ed or not:
    > 
    > 	if (!proc_tryhold(pr)) {
    > 		/* already exiting, pretent it's not there */
    > 		....
    > 	}
    > 
    > 	/* found and secured against exit */
    > 
    > 	/* done with it */
    > 	proc_rele(pr);
    > 
    > This in particular protects against the vmspace going down.
    
    
    
  • Philip Guenther:

    Kernel protection fault in fill_kproc()