From:
Martin Pieuchot <mpi@grenadille.net>
Subject:
Re: Another attempt to get rid of the reaper
To:
tech@openbsd.org
Date:
Mon, 20 Oct 2025 15:54:43 +0200

    On 13/10/25(Mon) 16:48, Claudio Jeker wrote:
    > On Fri, Oct 10, 2025 at 12:10:56PM +0200, Martin Pieuchot wrote:
    > > On 08/10/25(Wed) 17:07, Claudio Jeker wrote:
    > > > On Thu, Sep 18, 2025 at 05:45:46PM +0200, Martin Pieuchot wrote:
    > > > [...]  
    > > > > On 16/09/25(Tue) 11:34, Claudio Jeker wrote:
    > > > > > On Sun, Sep 14, 2025 at 10:36:51PM +0200, Christian Ludwig wrote:
    > > > > > > Hi,
    > > > > > > 
    > > > > > > this is another attempt to get rid of the dedicated reaper thread.
    > > > > > 
    > > > > > Why is this a goal? What problem are you trying to solve with this?
    > > > > 
    > > > > I don't know what Christian's goals are.  Here are the ones, I believe,
    > > > > we will all benefit from:
    > > > > 
    > > > > - The first goal is to ensure userland processes pay the price for their
    > > > > cleanup.  When this is not possible the best option is to make parents
    > > > > pay for their children.
    > > > 
    > > > I think by moving most into exit1() this goal is achieved. 
    > > 
    > > No it's not.  You're just not interested or not curious.
    > 
    > Don't accuse me of being not interested, not curious or blind. You should
    > know better, I shared a lot of profiling data with you.  I run a lot of
    > profiling and I'm probably one of the few people that looks at lltrace
    > output to better understand what is actually going on in detail.
    
    The point is not about sharing data, it is about discouraging people that
    improve things you don't care about.  You believe that's not worth
    improving, fine, don't involve yourself.
    
    > > You're mixing everything, I'm so tired.  This has nothing to do with
    > > UVM, rwlocks, the SCHED_LOCK() or the KERNEL_LOCK().
    > 
    > No, it has very much to do with this. You want to reduce context switches
    > in the exit path. You think there are 2 extra switches there that hurt us
    > badly.
    
    Yes and I'm not going to work on something else because you disagree.
    
    > > I already told you that the problem with uobjlk is fixed by the
    > > parallel diff that cannot be enabled because of sparc64.  The problem is
    > > sparc64 which has a broken pmap and nobody cares.  Theo doesn't care enough
    > > to send a bug report to bugs@.  Mark doesn't care to answer my emails
    > > and discredits my work.
    > > 
    > > I had to ask you after weeks to get a bug report.  Since then what
    > > happened?  Nothing.  The bug is there.  The release will ship with it.
    > > I even sent a workaround for sparc64 and nobody replied.  So yeah, great
    > > team work.
    > 
    > Yes, it is shitty team work. You are part of that team.
    > I have enough going on right now that has higher prio for me than hunting
    > down this bug. I already invested a lot of time into this and right now I
    > have no extra spare time to throw at this issue. I know this sucks.
    
    Then please stop bullshitting me about other issues that have nothing to
    do with this thread.
    
    > > The problem is not uobjlk, the problem is that nobody cares and I'm tired
    > > of arguing with people that do not care and do not want improvements.
    > > 
    > > We can't enable parallel faults on sparc64 because its pmap is broken.
    > > That's not new, it has been there for almost a year.
    > > 
    > > This has nothing to do with the reaper.  Nor the SCHED_LOCK.  Please
    > > stop mixing stuff.
    > 
    > I only said that if you want to make a difference in the number of context
    > switches that you should look at rwlocks and esp. those that are very
    > busy, since those generate a magnitude more context switches than those
    > in the exit path.
    
    No that is not what I am after.  You clearly misunderstood what I meant.
    
    > > > Not sure which contention you are after but taking away 1 or 2 context
    > > > switches at process exit will not move the needle.
    > > 
    > > Sure, computers are complicated, let's go shopping.
    > > 
    > > > > - Another goal is to see the cost of reaping processes to help us pick the
    > > > > right algorithms and not only measure the parts we are interested in. 
    > > > 
    > > > I think right now the cost is more visible than adding it into the wait
    > > > system call. Ideally the cost of uarea free should be minimal (which is
    > > > not true right now).
    > > 
    > > It is obvious the cost is not visible enough.  Your argument proves it.
    > 
    > I know very well how much CPU time the reaper eats. It is clearly visible
    > but it seems it is better to hide this information in a process like
    > init(8).
    
    Current OpenBSD hides it.  You don't see it with top(1) unless you put -S.
      
    > > > > > In my opinion this diff makes the current exit situation worse. Instead of
    > > > > > having a clear reaper process that does the cleanup of the proc and
    > > > > > process we now end up delegating this work to init(8) or the parent
    > > > > > process. Neither are really ideal to do this work.
    > > > > 
    > > > > Please note that the parent process is already doing the cleanup via
    > > > > process_zap().  This is what we want.  We want all the work not done 
    > > > > in exit1() to be done in the parent.
    > > > 
    > > > Do we really want that? Do we want to have zombies with full uarea etc
    > > > sitting around?
    > > 
    > > Then clean the uarea in exit.  This is a no brainer.
    > 
    > Is it a no brainer? How do you rip out the stack you're running on without
    > crashing?
    
    Next thread on CPU.  Same trick as spc_deadcond in mi_switch() in the
    diff.
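
A minimal userland sketch of that idea, assuming a single parked stack per
CPU.  This is not the code from the diff: struct cpu_state,
thread_exit_park() and switch_in() are hypothetical names made up for
illustration, and malloc()/free() stand in for the kernel's uarea
allocator.  The exiting thread only records its stack; the next thread to
run on that CPU, which is on its own stack, does the free.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-CPU state; not the kernel's struct schedstate_percpu. */
struct cpu_state {
	void *dead_stack;	/* stack parked by the thread that just exited */
	int stacks_freed;	/* counter, only here to make the effect observable */
};

/*
 * Called by the exiting thread.  It is still running on `stack`,
 * so it only records the pointer and leaves the free to someone else.
 */
static void
thread_exit_park(struct cpu_state *cpu, void *stack)
{
	assert(cpu->dead_stack == NULL);	/* one parked stack per CPU at a time */
	cpu->dead_stack = stack;
}

/*
 * Called by the next thread once it runs on this CPU, i.e. on its own
 * stack.  At that point nothing executes on the parked stack anymore,
 * so it can be released without a dedicated reaper thread.
 */
static void
switch_in(struct cpu_state *cpu)
{
	if (cpu->dead_stack != NULL) {
		free(cpu->dead_stack);
		cpu->dead_stack = NULL;
		cpu->stacks_freed++;
	}
}
```

The free costs no extra context switch here because it rides on a switch
that happens anyway.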
    
    > > > Agreed, I already mentioned multiple times that this is the important bit
    > > > and it should be moved into exit1() to reduce latency to a minimum.
    > > > 
    > > > > - it requires extra context switches which add extra contention and latency
    > > > 
    > > > I very much doubt this. If the signaling is done in exit1() most will
    > > > work.
    > > 
    > > How can you doubt a fact?  Just go read the code.
    > 
    > You think it is a fact, but it is not. The numbers show it clearly that
    > you are running after a ghost.
    
    No I'm not.  You're interpreting my words in your way to twist what I
    said.  There are two extra context switches in exit1().  This is what I
    want to get rid of.
    
    > > > > - all of that prevents tools like dpb(1) which gather data about
    > > > >   processes execution to do a better job
    > > >  
    > > > dpb(1) is primarily file system bound. It can not scale because it hammers
    > > > the disks and VFS so hard that everything spins.
    > > 
    > > Great nihilistic argument to not do anything.
    > 
    > No, my argument is that if you want to improve dpb build time then you
    > would look into VFS.
    
    That's not my point.  Here again you're twisting my words.  I don't want
    to look into the VFS.
    
    > > >                               As I said, lets move the signaling of the
    > > > parent to exit1() as a first step. Don't over do it.
    > > 
    > > Once the uarea is freed and the signaling is done there's nothing left
    > > in the reaper but the overhead of context switching.
    > 
    > In my book the reaper would just reduce the process to its minimal shell
    > which is the disposal of the uarea and vmspace. It does that in a rather
    > simple and clean way.
    
    I understood and I disagree.  
    
    > If those things are moved up then sure we don't need a reaper. I see no
    > real way to clean out the vmspace in exit1() since we need that space to
    > run at that point. At least I did not see a simple way that is not a
    > total hack.
    
    We won't need a reaper.
    
    
    