From: Jeremie Courreges-Anglas
Subject: Re: One reaper per CPU
To: Claudio Jeker, Christian Ludwig, tech@openbsd.org
Date: Thu, 11 Jul 2024 12:54:48 +0200

On Thu, Jul 11, 2024 at 12:46:56PM +0200, Martin Pieuchot wrote:
> On 10/07/24(Wed) 19:28, Jeremie Courreges-Anglas wrote:
> > On Wed, Jul 10, 2024 at 05:51:26PM +0200, Claudio Jeker wrote:
> > > On Wed, Jul 10, 2024 at 05:21:35PM +0200, Christian Ludwig wrote:
> > > > Hi,
> > > >
> > > > This implements one reaper process per CPU. The benefit is that we can
> > > > switch to a safe stack (the reaper) on process teardown without the need
> > > > to lock data structures between sched_exit() and the local reaper. That
> > > > means the global deadproc mutex goes away. And there is also no need to
> > > > go through idle anymore, because we know that the local reaper is not
> > > > currently running when the dying proc enlists itself.
> > > >
> > > > I have tested it lightly. It does not fall apart immediately. I do not
> > > > see any performance regressions in my test case, which is recompiling
> > > > the kernel over and over again, but your mileage may vary. So please give
> > > > this some serious testing on your boxes.
> > > >
> > >
> > > I thought everyone agreed that we need less reaper and not more.
> >
> > FWIW Christian and I are sitting next to each other during this
> > hackathon. We've been chatting about this since yesterday, and I like
> > the per-cpu idea. I tried to implement it in 2022 but didn't get a
> > working diff.
> >
> > The per-cpu reaper was art's initial plan when the reaper was
> > introduced in NetBSD.
> > NetBSD since reverted their use of a reaper,
> > but the idea still has nice properties:
> > - no need for locks and wakeup() between the exiting process and its
> >   dedicated reaper
> > - no need to garbage-collect exited threads like done on NetBSD or in
> >   my reaper diff, the percpu reaper can proc_free() them right away
> > - bye bye exit2()
> > - idle really is idle
>
> I'm not convinced. I thought our goal was to get rid of the reaper to
> make sure processes account for their own cleanup.
>
> > > All the heavy teardown done in the reaper must be moved to exit1().
> >
> > True. But nothing in the per-cpu reaper approach prevents us from
> > moving uvm_exit() from the reaper to exit1().
>
> Can we start with that then? What can't be moved in exit1()?

That's my plan after mpi commits his reaper/uvm unlocking diff.

> For what do we need 80 new kernel threads on a 80 CPUs machine?
>
> > > After that I think having a single reaper thread is just fine (we could
> > > even just use a task queue for cleaning up the final bits of a proc).
> >
> > If this per-cpu reaper proves stable, I suspect it would still allow
> > for less contention (locks) and latency (wakeups) than a task.
>
> Those are guesses. I'd be more confident with analysis and numbers.

Fair enough! Consensus in the hackroom is not to explore that option.

-- 
jca