From: Martin Pieuchot Subject: Re: [EXT] Re: Kernel protection fault in fill_kproc() To: Mateusz Guzik Cc: tech@openbsd.org Date: Sat, 13 Sep 2025 10:58:22 +0200 On 13/09/25(Sat) 01:59, Mateusz Guzik wrote: > On Thu, Aug 14, 2025 at 10:59:16AM +0000, Martin Pieuchot wrote: > > To quote some other parts of the thread: > > > > > In my opinion the fill_kproc() code is wrong and it should not look at > > > > pr->ps_vmspace if the PS_EXITING flag is set for the process. > > > > > > > > exit1() sets PS_EXITING flag early on and after that point the vm can be > > > > purged so the vm_resident_count() is probably wrong anyway. > > > > I guess that is safe since exit1() still runs with the kernel lock > > held.  Which means that the PS_EXITING flags can't be set while > > fill_kproc() runs, since it holds the kernel lock. > > > > > I agree with the above sentiment, and with this in mind I disagree with > the patch. Let me rephrase: you disagree with the existing logic. The diff I sent follow this logic. > Trying to catch all the ways fill_kproc() can race against process > teardown is an avoidable exercise in misery and can be handled the same > way it was in sysctl_file(): utilize a refcount which stalls process > exit. Maybe. Please show a diff, I'll be delighted to review it. That said this is not trivial at all and my diff is a step in this direction. > I do have some criticism of the way it was implemented over there though Again, send a diff and maybe we might work together. I'm well aware of all you wrote, so what? > On the exit side there is bunch of exiting-related flags being set, > including PS_EXITING and there is: > refcnt_finalize(&pr->ps_refcnt, "psdtor"); > > In sysctl_file(): > KERNEL_LOCK(); > [snip] > if (pr->ps_flags & (PS_SYSTEM | PS_EMBRYO | PS_EXITING)) > continue; > > refcnt_take(&pr->ps_refcnt); > .... > refcnt_rele_wake(&pr->ps_refcnt); > > I presume this works because proc exit also takes KERNEL_LOCK, but this > is bound to cause trouble at some point as the lock is pushed down. Then push it down and you might fix the issues. > Instead this relationship should be expicitly spelled out in relevant > helpers, so that if someone does work in the area and forgets to cover > it with new locking, they get a nice splat. Then go ahead and spell it out. > pseudo-code wise, something like this: That wouldn't work for other reasons. Please use our time wisely and start by writing, testing and explaining a diff. > /* > * One day this will be a per-process lock instead of kernel lock > */ > void > proc_drain_holders(struct proc *pr) > { > ASSERT_KERNEL_LOCKED(); > KASSERT(pr->ps_flags & PS_EXITING); > refcnt_finalize(&pr->ps_refcnt, "psdtor"); > } > > bool > proc_tryhold(struct proc *pr) > { > ASSERT_KERNEL_LOCKED(); > if (pr->ps_flags & PS_EXITING) > return false; > refcnt_take(&pr->ps_refcnt); > return true; > } > > bool > proc_rele(struct proc *pr) > { > ASSERT_KERNEL_LOCKED(); > refcnt_rele_wake(&pr->ps_refcnt); > } > > Then sysctl_file() and fill_kproc() can do this, KERNEL_LOCK'ed or not: > > if (!proc_tryhold(pr)) { > /* already exiting, pretent it's not there */ > .... > } > > /* found and secured against exit */ > > /* done with it */ > proc_rele(pr); > > This in particular protects against the vmspace going down.