From: Martin Pieuchot
Subject: Re: One reaper per CPU
To: Claudio Jeker, Christian Ludwig, tech@openbsd.org
Date: Thu, 11 Jul 2024 12:46:56 +0200

On 10/07/24(Wed) 19:28, Jeremie Courreges-Anglas wrote:
> On Wed, Jul 10, 2024 at 05:51:26PM +0200, Claudio Jeker wrote:
> > On Wed, Jul 10, 2024 at 05:21:35PM +0200, Christian Ludwig wrote:
> > > Hi,
> > >
> > > This implements one reaper process per CPU. The benefit is that we can
> > > switch to a safe stack (the reaper) on process teardown without the need
> > > to lock data structures between sched_exit() and the local reaper. That
> > > means the global deadproc mutex goes away. And there is also no need to
> > > go through idle anymore, because we know that the local reaper is not
> > > currently running when the dying proc enlists itself.
> > >
> > > I have tested it lightly. It does not fall apart immediately. I do not
> > > see any performance regressions in my test case, which is recompiling
> > > the kernel over and over again, but your mileage may vary. So please give
> > > this some serious testing on your boxes.
> > >
> >
> > I thought everyone agreed that we need less reaper and not more.
>
> FWIW Christian and I are sitting next to each other during this
> hackathon. We've been chatting about this since yesterday, and I like
> the per-cpu idea. I tried to implement it in 2022 but didn't get a
> working diff.
> The per-cpu reaper was art's initial plan when the reaper was
> introduced in NetBSD. NetBSD since reverted their use of a reaper,
> but the idea still has nice properties:
> - no need for locks and wakeup() between the exiting process and its
>   dedicated reaper
> - no need to garbage-collect exited threads like done on NetBSD or in
>   my reaper diff, the percpu reaper can proc_free() them right away
> - bye bye exit2()
> - idle really is idle

I'm not convinced. I thought our goal was to get rid of the reaper to
make sure processes account for their own cleanup.

> > All the heavy teardown done in the reaper must be moved to exit1().
>
> True. But nothing in the per-cpu reaper approach prevents us from
> moving uvm_exit() from the reaper to exit1().

Can we start with that then? What can't be moved into exit1()? Why do
we need 80 new kernel threads on an 80-CPU machine?

> > After that I think having a single reaper thread is just fine (we could
> > even just use a task queue for cleaning up the final bits of a proc).
>
> If this per-cpu reaper proves stable, I suspect it would still allow
> for less contention (locks) and latency (wakeups) than a task.

Those are guesses. I'd be more confident with analysis and numbers.