From: Mateusz Guzik Subject: Re: [EXT] Re: Kernel protection fault in fill_kproc() To: Martin Pieuchot Cc: tech@openbsd.org Date: Sat, 13 Sep 2025 01:59:51 +0200 On Thu, Aug 14, 2025 at 10:59:16AM +0000, Martin Pieuchot wrote: To quote some other parts of the thread: > > > In my opinion the fill_kproc() code is wrong and it should not look at > > > pr->ps_vmspace if the PS_EXITING flag is set for the process. > > > > > > exit1() sets PS_EXITING flag early on and after that point the vm can be > > > purged so the vm_resident_count() is probably wrong anyway. > > I guess that is safe since exit1() still runs with the kernel lock > held.  Which means that the PS_EXITING flags can't be set while > fill_kproc() runs, since it holds the kernel lock. > I agree with the above sentiment, and with this in mind I disagree with the patch. Trying to catch all the ways fill_kproc() can race against process teardown is an avoidable exercise in misery and can be handled the same way it was in sysctl_file(): utilize a refcount which stalls process exit. I do have some criticism of the way it was implemented over there though On the exit side there is bunch of exiting-related flags being set, including PS_EXITING and there is: refcnt_finalize(&pr->ps_refcnt, "psdtor"); In sysctl_file(): KERNEL_LOCK(); [snip] if (pr->ps_flags & (PS_SYSTEM | PS_EMBRYO | PS_EXITING)) continue; refcnt_take(&pr->ps_refcnt); .... refcnt_rele_wake(&pr->ps_refcnt); I presume this works because proc exit also takes KERNEL_LOCK, but this is bound to cause trouble at some point as the lock is pushed down. Instead this relationship should be expicitly spelled out in relevant helpers, so that if someone does work in the area and forgets to cover it with new locking, they get a nice splat. pseudo-code wise, something like this: /* * One day this will be a per-process lock instead of kernel lock */ void proc_drain_holders(struct proc *pr) { ASSERT_KERNEL_LOCKED(); KASSERT(pr->ps_flags & PS_EXITING); refcnt_finalize(&pr->ps_refcnt, "psdtor"); } bool proc_tryhold(struct proc *pr) { ASSERT_KERNEL_LOCKED(); if (pr->ps_flags & PS_EXITING) return false; refcnt_take(&pr->ps_refcnt); return true; } bool proc_rele(struct proc *pr) { ASSERT_KERNEL_LOCKED(); refcnt_rele_wake(&pr->ps_refcnt); } Then sysctl_file() and fill_kproc() can do this, KERNEL_LOCK'ed or not: if (!proc_tryhold(pr)) { /* already exiting, pretent it's not there */ .... } /* found and secured against exit */ /* done with it */ proc_rele(pr); This in particular protects against the vmspace going down.