From: Martin Pieuchot <mpi@openbsd.org>
Subject: Re: powersave CPU policy
To: tech@openbsd.org
Date: Wed, 12 Jun 2024 12:34:17 +0200

On 03/06/24(Mon) 21:28, Kirill A. Korinsky wrote:
> On Mon, 03 Jun 2024 17:44:27 +0100,
> Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
> > 
> > So my take on this is: not now!  I'm in the same area to improve the
> > suspend-to-idle code that was recently committed.
> > 
> > Also, the diff does many things that should probably be separate
> > changes.  Also, this may work for your machine, and for your use case,
> > but it needs to work for everybody and across architectures.
> >
> > And while the idea of turning off cores that aren't used has merit, in the end
> > this depends on the scheduler to give you the right information.  And
> > there are plans to make some drastic changes in that area as well.
> >

If I understand your diff correctly it is about dynamically removing a
bunch of CPUs from the scheduler in order to place them in the deepest
sleep state?

Is your work based on some previous state of the art?  How did you choose
the number in your algorithm?

I have a related question.  acpicpu_idle() currently checks if a CPU is
idle via cpu_is_idle().  If the CPU is removed from the scheduler, this
will always be true.  I'm annoyed with these checks because they rely on
distributing runnable threads to CPUs in advance.  Do you have an idea
on how to remove those checks from the various *idle() routines without
compromising reactivity?
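
To make the concern concrete, here is a toy userland model, not kernel
code.  It assumes the check boils down to "this CPU's run queue is
empty"; toy_cpu, toy_enqueue() and toy_cpu_is_idle() are made-up names
for illustration only.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU scheduler state. */
struct toy_cpu {
	int	nrun;	/* runnable threads queued on this CPU */
	bool	halted;	/* CPU has been removed from the scheduler */
};

/* Assumed shape of the idle check: no runnable thread queued here. */
static bool
toy_cpu_is_idle(const struct toy_cpu *c)
{
	return c->nrun == 0;
}

/*
 * Stand-in for the scheduler placing a thread on a CPU.  Halted CPUs
 * are skipped, so their queue never grows.  Returns true if queued.
 */
static bool
toy_enqueue(struct toy_cpu *c)
{
	if (c->halted)
		return false;
	c->nrun++;
	return true;
}
```

Once halted is set nothing is ever queued, so the check can no longer
tell the idle routine anything.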
 
> I totally agree that this can be split into pieces. The first and simplest
> part is the refactoring to extract sched_start_cpu / sched_stop_cpu; the
> next one, which I assume raises some questions, is the parallel selection
> of the C-state for a halted CPU.

This could be of interest on its own.  I believe the need_resched() call
is not strictly needed at the moment, it is just there to speed up moving
the CPU to idle, right?

> I am on this list in the hope of getting some feedback on this work and
> making it cleaner and better in the end, because at least for my use case
> it improves the experience, and it can probably be used by others.

Regarding your algorithm, I see that you ensure that at least one CPU is
always online.  Since CPU0 is the one handling most of the interrupts,
shouldn't it be special?  And what about interrupts distributed to other
CPUs?  Can we check for that?  Does it matter?
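
By "special" I mean something like the sketch below.  It is only an
illustration of one possible policy, not what your diff does:
toy_pick_halted() and TOY_PRIMARY are hypothetical names, and the
idle[] input is assumed to come from whatever measurement is used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PRIMARY	0	/* assumed index of the interrupt-handling CPU */

/*
 * Mark in halt[] the CPUs to take out of the scheduler and return how
 * many were marked.  idle[i] is true when CPU i looked idle over the
 * last sampling window.  The primary CPU is never marked, which also
 * guarantees that at least one CPU stays online.
 */
static size_t
toy_pick_halted(const bool *idle, bool *halt, size_t ncpus)
{
	size_t i, nhalt = 0;

	assert(ncpus > 0);
	for (i = 0; i < ncpus; i++) {
		halt[i] = (i != TOY_PRIMARY) && idle[i];
		if (halt[i])
			nhalt++;
	}
	return nhalt;
}
```

This does not address interrupts routed to the other CPUs; that would
need an extra per-CPU input next to idle[].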

I see that you use CP_IDLE to determine if a CPU should be removed from
the scheduler.  Is it by choice or because that's the only measurement
available?  Isn't a tick too coarse a unit for that?
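
To illustrate the granularity issue, here is a minimal sketch of an
idle share computed from per-state tick counters.  The two-state
layout and the toy_* names are simplifications for the example, not
the kernel's actual counters.

```c
#include <stdint.h>

/* Simplified, hypothetical tick states: everything non-idle is "busy". */
enum { TOY_CP_BUSY, TOY_CP_IDLE, TOY_CPUSTATES };

/*
 * Idle share, in percent, between two snapshots of the tick counters.
 * Returns -1 if no tick elapsed between the snapshots.
 */
static int
toy_idle_percent(const uint64_t prev[TOY_CPUSTATES],
    const uint64_t cur[TOY_CPUSTATES])
{
	uint64_t total = 0, idle;
	int i;

	for (i = 0; i < TOY_CPUSTATES; i++)
		total += cur[i] - prev[i];
	if (total == 0)
		return -1;
	idle = cur[TOY_CP_IDLE] - prev[TOY_CP_IDLE];
	return (int)(idle * 100 / total);
}
```

Over a window of N ticks the result can only move in steps of roughly
100/N percent.  With a one-tick window it is either 0 or 100, however
briefly the CPU was actually busy.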

I'm worried about the use of SPCF_HALTED outside of the scheduler code.
What about SMT threads that are no longer part of the scheduler?  Are
there any side effects of placing them in the deepest C-state?  Does that
make any sense?