producer/consumer locking
this provides coordination between things producing and consuming
data when you don't want to block or delay the thing producing data,
but it's ok to make the consumer do more work to compensate for that.
the mechanism is a generalisation of the coordination used in the mp
counters api and some of the process accounting code. data updated
by a producer is versioned, and the consumer reads the version on each
side of the critical section to see if it's been updated. if the
producer has updated the version, then the consumer has to retry.
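as a sketch, the pattern looks like this in userland c. this is a toy
model of the api in the diff below, not the kernel code: the
membar_producer()/membar_consumer() calls and the multi-producer
interlock are left out, so it only illustrates the versioning scheme.

```c
/*
 * toy userland model of the pclock scheme. the generation number
 * is odd while a producer is updating; consumers retry if it
 * changed across their reads. memory barriers are elided, so this
 * is a single-threaded illustration only.
 */
struct pc_lock {
	volatile unsigned int pcl_gen;
};

static unsigned int
pc_sprod_enter(struct pc_lock *pcl)
{
	unsigned int gen = pcl->pcl_gen;

	pcl->pcl_gen = ++gen;		/* odd: update in progress */
	return (gen);
}

static void
pc_sprod_leave(struct pc_lock *pcl, unsigned int gen)
{
	pcl->pcl_gen = ++gen;		/* even: update is visible */
}

static void
pc_cons_enter(struct pc_lock *pcl, unsigned int *genp)
{
	unsigned int gen;

	do {
		gen = pcl->pcl_gen;
	} while (gen & 1);		/* wait out an in-flight update */
	*genp = gen;
}

static int
pc_cons_leave(struct pc_lock *pcl, unsigned int *genp)
{
	unsigned int gen = pcl->pcl_gen;

	if (gen == *genp)
		return (0);		/* no producer ran: snapshot is good */
	*genp = gen;
	return (1);			/* caller must retry the reads */
}

/* versioned data and its producer/consumer */
static struct pc_lock lk;
static unsigned int counters[2];

static void
produce(void)
{
	unsigned int gen;

	gen = pc_sprod_enter(&lk);
	counters[0]++;
	counters[1]++;
	pc_sprod_leave(&lk, gen);
}

static void
snapshot(unsigned int *snap)
{
	unsigned int gen;

	pc_cons_enter(&lk, &gen);
	do {
		snap[0] = counters[0];
		snap[1] = counters[1];
	} while (pc_cons_leave(&lk, &gen) != 0);
}
```

the real versions in kern_lock.c add the barriers, and pc_mprod_enter()
uses an atomic_cas_uint() loop on the odd generation so concurrent
producers interlock.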
the diff includes the migration of the process accounting to the
generalised api, and adds it to the cpu state counters on each cpu.
it's now possible to get a consistent snapshot of the cpu counters, even
if the reader is preempted by statclock.
i also have a pf diff that uses these. it allows pf to maintain counters
that userland can read without blocking the execution of pf. handy.
the only thing i'm worried about is the use of the alias function
attributes for !MULTIPROCESSOR kernels, but we use those in libc on all
our archs/compilers and it seems to be fine.
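for reference, the alias trick boils down to this (illustrative names,
not the kernel symbols):

```c
/*
 * sketch of the gnu c alias function attribute as used for
 * !MULTIPROCESSOR kernels: the second name is just another symbol
 * for the first definition. names here are illustrative only.
 */
static unsigned int
sprod_enter_impl(unsigned int *gen)
{
	return (++(*gen));
}

/* mprod_enter_alias resolves to the same code as sprod_enter_impl */
unsigned int mprod_enter_alias(unsigned int *gen)
	__attribute__((alias("sprod_enter_impl")));
```

the alias target has to be defined in the same translation unit, which
is why the !MULTIPROCESSOR declarations sit next to the sprod functions
in kern_lock.c.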
the manpage looks like this:
PC_LOCK_INIT(9) Kernel Developer's Manual PC_LOCK_INIT(9)
NAME
pc_lock_init, pc_cons_enter, pc_cons_leave, pc_sprod_enter,
pc_sprod_leave, pc_mprod_enter, pc_mprod_leave, PC_LOCK_INITIALIZER -
producer/consumer locks
SYNOPSIS
#include <sys/pclock.h>
void
pc_lock_init(struct pc_lock *pcl);
void
pc_cons_enter(struct pc_lock *pcl, unsigned int *genp);
int
pc_cons_leave(struct pc_lock *pcl, unsigned int *genp);
unsigned int
pc_sprod_enter(struct pc_lock *pcl);
void
pc_sprod_leave(struct pc_lock *pcl, unsigned int gen);
unsigned int
pc_mprod_enter(struct pc_lock *pcl);
void
pc_mprod_leave(struct pc_lock *pcl, unsigned int gen);
PC_LOCK_INITIALIZER();
DESCRIPTION
The producer/consumer lock functions provide mechanisms for a consumer to
read data without blocking or delaying another CPU or an interrupt when
it is updating or producing data. A variant of the producer locking
functions provides mutual exclusion between concurrent producers.
This is implemented by having producers version the protected data with a
generation number. Consumers of the data compare the generation number
at the start of the critical section to the generation number at the end,
and must retry reading the data if the generation number has changed.
The pc_lock_init() function is used to initialise the producer/consumer
lock pointed to by pcl.
A producer/consumer lock declaration may be initialised with the
PC_LOCK_INITIALIZER() macro.
Consumer API
pc_cons_enter() reads the current generation number from pcl and stores
it in the memory provided by the caller via genp.
pc_cons_leave() compares the generation number in pcl with the value
stored in genp by pc_cons_enter() at the start of the critical section,
and returns whether the reads within the critical section need to be
retried because the data has been updated by the producer.
Single Producer API
The single producer API is optimised for updating data where only a
single producer can run at a time.
pc_sprod_enter() marks the beginning of a single producer critical
section for the pcl producer/consumer lock.
pc_sprod_leave() marks the end of a single producer critical section for
the pcl producer/consumer lock. The gen argument must be the value
returned from the preceding pc_sprod_enter() call.
Multiple Producer API
The multiple producer API provides mutual exclusion between multiple CPUs
entering the critical section concurrently. Unlike mtx_enter(9), the
multiple producer API does not prevent preemption by interrupts, it only
provides mutual exclusion between CPUs. If protection from preemption is
required, splraise(9) can be used to protect the producer critical
section.
pc_mprod_enter() marks the beginning of a multiple producer critical
section for the pcl producer/consumer lock.
pc_mprod_leave() marks the end of a multiple producer critical section
for the pcl producer/consumer lock. The gen argument must be the value
returned from the preceding pc_mprod_enter() call.
On uniprocessor kernels the multiple producer API is aliased to the
single producer API.
CONTEXT
pc_lock_init(), pc_cons_enter(), pc_cons_leave(), pc_sprod_enter(),
pc_sprod_leave(), pc_mprod_enter(), and pc_mprod_leave() can be called
during autoconf, from process context, or from interrupt context.
pc_sprod_enter(), pc_sprod_leave(), pc_mprod_enter(), and
pc_mprod_leave() may run concurrently with (ie, on another CPU to), or
preempt (ie, run at a higher interrupt level), pc_cons_enter() and
pc_cons_leave().
pc_sprod_enter(), pc_sprod_leave(), pc_mprod_enter(), and
pc_mprod_leave() must not be preempted or interrupted by the producer or
consumer API for the same lock.
RETURN VALUES
pc_cons_leave() returns 0 if the critical section did not overlap with an
update from a producer, or non-zero if the critical section must be
retried.
EXAMPLES
To produce or update data:
struct pc_lock pc = PC_LOCK_INITIALIZER();
void
producer(void)
{
unsigned int gen;
gen = pc_sprod_enter(&pc);
/* update data */
pc_sprod_leave(&pc, gen);
}
A consistent read of the data from a consumer:
void
consumer(void)
{
unsigned int gen;
pc_cons_enter(&pc, &gen);
do {
/* read data */
} while (pc_cons_leave(&pc, &gen) != 0);
}
SEE ALSO
mutex(9), splraise(9)
HISTORY
The pc_lock_init functions first appeared in OpenBSD 7.8.
AUTHORS
The pc_lock_init functions were written by David Gwynne
<dlg@openbsd.org>.
CAVEATS
Updates must be produced infrequently enough to allow time for consumers
to be able to get a consistent read without looping too often.
Because consuming the data may loop when retrying, care must be taken to
avoid side effects from reading the data multiple times, eg, when
accumulating values.
ok?
Index: share/man/man9/Makefile
===================================================================
RCS file: /cvs/src/share/man/man9/Makefile,v
diff -u -p -r1.310 Makefile
--- share/man/man9/Makefile 24 Feb 2024 16:21:32 -0000 1.310
+++ share/man/man9/Makefile 4 May 2025 07:18:11 -0000
@@ -29,7 +29,8 @@ MAN= aml_evalnode.9 atomic_add_int.9 ato
malloc.9 membar_sync.9 memcmp.9 mbuf.9 mbuf_tags.9 md5.9 mi_switch.9 \
microtime.9 ml_init.9 mq_init.9 mutex.9 \
namei.9 \
- panic.9 pci_conf_read.9 pci_mapreg_map.9 pci_intr_map.9 physio.9 \
+ panic.9 pci_conf_read.9 pci_mapreg_map.9 pci_intr_map.9 \
+ pc_lock_init.9 physio.9 \
pmap.9 pool.9 pool_cache_init.9 ppsratecheck.9 printf.9 psignal.9 \
RBT_INIT.9 \
radio.9 arc4random.9 rasops.9 ratecheck.9 refcnt_init.9 resettodr.9 \
Index: share/man/man9/pc_lock_init.9
===================================================================
RCS file: share/man/man9/pc_lock_init.9
diff -N share/man/man9/pc_lock_init.9
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ share/man/man9/pc_lock_init.9 4 May 2025 07:18:11 -0000
@@ -0,0 +1,212 @@
+.\" $OpenBSD$
+.\"
+.\" Copyright (c) 2025 David Gwynne <dlg@openbsd.org>
+.\" All rights reserved.
+.\"
+.\" Permission to use, copy, modify, and distribute this software for any
+.\" purpose with or without fee is hereby granted, provided that the above
+.\" copyright notice and this permission notice appear in all copies.
+.\"
+.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+.\"
+.Dd $Mdocdate: May 4 2025 $
+.Dt PC_LOCK_INIT 9
+.Os
+.Sh NAME
+.Nm pc_lock_init ,
+.Nm pc_cons_enter ,
+.Nm pc_cons_leave ,
+.Nm pc_sprod_enter ,
+.Nm pc_sprod_leave ,
+.Nm pc_mprod_enter ,
+.Nm pc_mprod_leave ,
+.Nm PC_LOCK_INITIALIZER
+.Nd producer/consumer locks
+.Sh SYNOPSIS
+.In sys/pclock.h
+.Ft void
+.Fn pc_lock_init "struct pc_lock *pcl"
+.Ft void
+.Fn pc_cons_enter "struct pc_lock *pcl" "unsigned int *genp"
+.Ft int
+.Fn pc_cons_leave "struct pc_lock *pcl" "unsigned int *genp"
+.Ft unsigned int
+.Fn pc_sprod_enter "struct pc_lock *pcl"
+.Ft void
+.Fn pc_sprod_leave "struct pc_lock *pcl" "unsigned int gen"
+.Ft unsigned int
+.Fn pc_mprod_enter "struct pc_lock *pcl"
+.Ft void
+.Fn pc_mprod_leave "struct pc_lock *pcl" "unsigned int gen"
+.Fn PC_LOCK_INITIALIZER
+.Sh DESCRIPTION
+The producer/consumer lock functions provide mechanisms for a
+consumer to read data without blocking or delaying another CPU or
+an interrupt when it is updating or producing data.
+A variant of the producer locking functions provides mutual exclusion
+between multiple producers.
+.Pp
+This is implemented by having producers version the protected data
+with a generation number.
+Consumers of the data compare the generation number at the start
+of the critical section to the generation number at the end, and
+must retry reading the data if the generation number has changed.
+.Pp
+The
+.Fn pc_lock_init
+function is used to initialise the producer/consumer lock pointed to by
+.Fa pcl .
+.Pp
+A producer/consumer lock declaration may be initialised with the
+.Fn PC_LOCK_INITIALIZER
+macro.
+.Ss Consumer API
+.Fn pc_cons_enter
+reads the current generation number from
+.Fa pcl
+and stores it in the memory provided by the caller via
+.Fa genp .
+.Pp
+.Fn pc_cons_leave
+compares the generation number in
+.Fa pcl
+with the value stored in
+.Fa genp
+by
+.Fn pc_cons_enter
+at the start of the critical section, and returns whether the reads
+within the critical section need to be retried because the data has
+been updated by the producer.
+.Ss Single Producer API
+The single producer API is optimised for updating data where only a single producer can run at a time.
+.Pp
+.Fn pc_sprod_enter
+marks the beginning of a single producer critical section for the
+.Fa pcl
+producer/consumer lock.
+.Pp
+.Fn pc_sprod_leave
+marks the end of a single producer critical section for the
+.Fa pcl
+producer/consumer lock.
+The
+.Fa gen
+argument must be the value returned from the preceding
+.Fn pc_sprod_enter
+call.
+.Ss Multiple Producer API
+The multiple producer API provides mutual exclusion between multiple
+CPUs entering the critical section concurrently.
+Unlike
+.Xr mtx_enter 9 ,
+the multiple producer API does not prevent preemption by interrupts,
+it only provides mutual exclusion between CPUs.
+If protection from preemption is required,
+.Xr splraise 9
+can be used to protect the producer critical section.
+.Pp
+.Fn pc_mprod_enter
+marks the beginning of a multiple producer critical section for the
+.Fa pcl
+producer/consumer lock.
+.Pp
+.Fn pc_mprod_leave
+marks the end of a multiple producer critical section for the
+.Fa pcl
+producer/consumer lock.
+The
+.Fa gen
+argument must be the value returned from the preceding
+.Fn pc_mprod_enter
+call.
+.Pp
+On uniprocessor kernels the multiple producer API is aliased to the
+single producer API.
+.Sh CONTEXT
+.Fn pc_lock_init ,
+.Fn pc_cons_enter ,
+.Fn pc_cons_leave ,
+.Fn pc_sprod_enter ,
+.Fn pc_sprod_leave ,
+.Fn pc_mprod_enter ,
+.Fn pc_mprod_leave
+can be called during autoconf, from process context, or from interrupt context.
+.Pp
+.Fn pc_sprod_enter ,
+.Fn pc_sprod_leave ,
+.Fn pc_mprod_enter ,
+and
+.Fn pc_mprod_leave
+may run concurrently with (ie, on another CPU to)
+or preempt (ie, run at a higher interrupt level)
+.Fn pc_cons_enter
+and
+.Fn pc_cons_leave .
+.Pp
+.Fn pc_sprod_enter ,
+.Fn pc_sprod_leave ,
+.Fn pc_mprod_enter ,
+and
+.Fn pc_mprod_leave
+must not be preempted or interrupted by the producer or consumer
+API for the same lock.
+.Sh RETURN VALUES
+.Fn pc_cons_leave
+returns 0 if the critical section did not overlap with an update
+from a producer, or non-zero if the critical section must be retried.
+.Sh EXAMPLES
+To produce or update data:
+.Bd -literal -offset indent
+struct pc_lock pc = PC_LOCK_INITIALIZER();
+
+void
+producer(void)
+{
+ unsigned int gen;
+
+ gen = pc_sprod_enter(&pc);
+ /* update data */
+ pc_sprod_leave(&pc, gen);
+}
+.Ed
+.Pp
+A consistent read of the data from a consumer:
+.Bd -literal -offset indent
+void
+consumer(void)
+{
+ unsigned int gen;
+
+ pc_cons_enter(&pc, &gen);
+ do {
+ /* read data */
+ } while (pc_cons_leave(&pc, &gen) != 0);
+}
+.Ed
+.Sh SEE ALSO
+.Xr mutex 9 ,
+.Xr splraise 9
+.Sh HISTORY
+The
+.Nm
+functions first appeared in
+.Ox 7.8 .
+.Sh AUTHORS
+The
+.Nm
+functions were written by
+.An David Gwynne Aq Mt dlg@openbsd.org .
+.Sh CAVEATS
+Updates must be produced infrequently enough to allow time for
+consumers to be able to get a consistent read without looping too
+often.
+.Pp
+Because consuming the data may loop when retrying, care must be
+taken to avoid side effects from reading the data multiple times,
+eg, when accumulating values.
Index: sys/kern/kern_clock.c
===================================================================
RCS file: /cvs/src/sys/kern/kern_clock.c,v
diff -u -p -r1.125 kern_clock.c
--- sys/kern/kern_clock.c 2 May 2025 05:04:38 -0000 1.125
+++ sys/kern/kern_clock.c 4 May 2025 07:18:11 -0000
@@ -270,6 +270,7 @@ statclock(struct clockrequest *cr, void
struct process *pr;
int tu_tick = -1;
int cp_time;
+ unsigned int gen;
if (statclock_is_randomized) {
count = clockrequest_advance_random(cr, statclock_min,
@@ -313,7 +314,9 @@ statclock(struct clockrequest *cr, void
cp_time = CP_SPIN;
}
+ gen = pc_sprod_enter(&spc->spc_cp_time_lock);
spc->spc_cp_time[cp_time] += count;
+ pc_sprod_leave(&spc->spc_cp_time_lock, gen);
if (p != NULL) {
p->p_cpticks += count;
@@ -322,7 +325,7 @@ statclock(struct clockrequest *cr, void
struct vmspace *vm = p->p_vmspace;
struct tusage *tu = &p->p_tu;
- tu_enter(tu);
+ gen = tu_enter(tu);
tu->tu_ticks[tu_tick] += count;
/* maxrss is handled by uvm */
@@ -334,7 +337,7 @@ statclock(struct clockrequest *cr, void
tu->tu_isrss +=
(vm->vm_ssize << (PAGE_SHIFT - 10)) * count;
}
- tu_leave(tu);
+ tu_leave(tu, gen);
}
/*
Index: sys/kern/kern_exec.c
===================================================================
RCS file: /cvs/src/sys/kern/kern_exec.c,v
diff -u -p -r1.262 kern_exec.c
--- sys/kern/kern_exec.c 17 Feb 2025 10:07:10 -0000 1.262
+++ sys/kern/kern_exec.c 4 May 2025 07:18:11 -0000
@@ -699,7 +699,7 @@ sys_execve(struct proc *p, void *v, regi
/* reset CPU time usage for the thread, but not the process */
timespecclear(&p->p_tu.tu_runtime);
p->p_tu.tu_uticks = p->p_tu.tu_sticks = p->p_tu.tu_iticks = 0;
- p->p_tu.tu_gen = 0;
+ pc_lock_init(&p->p_tu.tu_pcl);
memset(p->p_name, 0, sizeof p->p_name);
Index: sys/kern/kern_lock.c
===================================================================
RCS file: /cvs/src/sys/kern/kern_lock.c,v
diff -u -p -r1.75 kern_lock.c
--- sys/kern/kern_lock.c 3 Jul 2024 01:36:50 -0000 1.75
+++ sys/kern/kern_lock.c 4 May 2025 07:18:11 -0000
@@ -24,6 +24,7 @@
#include <sys/atomic.h>
#include <sys/witness.h>
#include <sys/mutex.h>
+#include <sys/pclock.h>
#include <ddb/db_output.h>
@@ -418,3 +419,102 @@ _mtx_init_flags(struct mutex *m, int ipl
_mtx_init(m, ipl);
}
#endif /* WITNESS */
+
+void
+pc_lock_init(struct pc_lock *pcl)
+{
+ pcl->pcl_gen = 0;
+}
+
+unsigned int
+pc_sprod_enter(struct pc_lock *pcl)
+{
+ unsigned int gen;
+
+ gen = pcl->pcl_gen;
+ pcl->pcl_gen = ++gen;
+ membar_producer();
+
+ return (gen);
+}
+
+void
+pc_sprod_leave(struct pc_lock *pcl, unsigned int gen)
+{
+ membar_producer();
+ pcl->pcl_gen = ++gen;
+}
+
+#ifdef MULTIPROCESSOR
+unsigned int
+pc_mprod_enter(struct pc_lock *pcl)
+{
+ unsigned int gen, ngen, ogen;
+
+ gen = pcl->pcl_gen;
+ for (;;) {
+ while (gen & 1) {
+ CPU_BUSY_CYCLE();
+ gen = pcl->pcl_gen;
+ }
+
+ ngen = 1 + gen;
+ ogen = atomic_cas_uint(&pcl->pcl_gen, gen, ngen);
+ if (gen == ogen)
+ break;
+
+ CPU_BUSY_CYCLE();
+ gen = ogen;
+ }
+
+ membar_enter_after_atomic();
+ return (ngen);
+}
+
+void
+pc_mprod_leave(struct pc_lock *pcl, unsigned int gen)
+{
+ membar_exit();
+ pcl->pcl_gen = ++gen;
+}
+#else /* MULTIPROCESSOR */
+unsigned int pc_mprod_enter(struct pc_lock *)
+ __attribute__((alias("pc_sprod_enter")));
+void pc_mprod_leave(struct pc_lock *, unsigned int)
+ __attribute__((alias("pc_sprod_leave")));
+#endif /* MULTIPROCESSOR */
+
+void
+pc_cons_enter(struct pc_lock *pcl, unsigned int *genp)
+{
+ unsigned int gen;
+
+ gen = pcl->pcl_gen;
+ while (gen & 1) {
+ CPU_BUSY_CYCLE();
+ gen = pcl->pcl_gen;
+ }
+
+ membar_consumer();
+ *genp = gen;
+}
+
+int
+pc_cons_leave(struct pc_lock *pcl, unsigned int *genp)
+{
+ unsigned int gen;
+
+ membar_consumer();
+
+ gen = pcl->pcl_gen;
+ if (gen & 1) {
+ do {
+ CPU_BUSY_CYCLE();
+ gen = pcl->pcl_gen;
+ } while (gen & 1);
+ } else if (gen == *genp)
+ return (0);
+
+ *genp = gen;
+ return (EBUSY);
+}
Index: sys/kern/kern_resource.c
===================================================================
RCS file: /cvs/src/sys/kern/kern_resource.c,v
diff -u -p -r1.94 kern_resource.c
--- sys/kern/kern_resource.c 2 May 2025 05:04:38 -0000 1.94
+++ sys/kern/kern_resource.c 4 May 2025 07:18:11 -0000
@@ -63,7 +63,7 @@ struct plimit *lim_copy(struct plimit *)
struct plimit *lim_write_begin(void);
void lim_write_commit(struct plimit *);
-void tuagg_sumup(struct tusage *, const struct tusage *);
+void tuagg_sumup(struct tusage *, struct tusage *);
/*
* Patchable maximum data and stack limits.
@@ -369,28 +369,15 @@ sys_getrlimit(struct proc *p, void *v, r
/* Add the counts from *from to *tu, ensuring a consistent read of *from. */
void
-tuagg_sumup(struct tusage *tu, const struct tusage *from)
+tuagg_sumup(struct tusage *tu, struct tusage *from)
{
struct tusage tmp;
- uint64_t enter, leave;
+ unsigned int gen;
- enter = from->tu_gen;
- for (;;) {
- /* the generation number is odd during an update */
- while (enter & 1) {
- CPU_BUSY_CYCLE();
- enter = from->tu_gen;
- }
-
- membar_consumer();
+ pc_cons_enter(&from->tu_pcl, &gen);
+ do {
tmp = *from;
- membar_consumer();
- leave = from->tu_gen;
-
- if (enter == leave)
- break;
- enter = leave;
- }
+ } while (pc_cons_leave(&from->tu_pcl, &gen) != 0);
tu->tu_uticks += tmp.tu_uticks;
tu->tu_sticks += tmp.tu_sticks;
@@ -433,12 +420,14 @@ tuagg_get_process(struct tusage *tu, str
void
tuagg_add_process(struct process *pr, struct proc *p)
{
+ unsigned int gen;
+
MUTEX_ASSERT_LOCKED(&pr->ps_mtx);
KASSERT(curproc == p || p->p_stat == SDEAD);
- tu_enter(&pr->ps_tu);
+ gen = tu_enter(&pr->ps_tu);
tuagg_sumup(&pr->ps_tu, &p->p_tu);
- tu_leave(&pr->ps_tu);
+ tu_leave(&pr->ps_tu, gen);
/* Now reset CPU time usage for the thread. */
timespecclear(&p->p_tu.tu_runtime);
@@ -452,6 +441,7 @@ tuagg_add_runtime(void)
struct schedstate_percpu *spc = &curcpu()->ci_schedstate;
struct proc *p = curproc;
struct timespec ts, delta;
+ unsigned int gen;
/*
* Compute the amount of time during which the current
@@ -472,9 +462,9 @@ tuagg_add_runtime(void)
}
/* update spc_runtime */
spc->spc_runtime = ts;
- tu_enter(&p->p_tu);
+ gen = tu_enter(&p->p_tu);
timespecadd(&p->p_tu.tu_runtime, &delta, &p->p_tu.tu_runtime);
- tu_leave(&p->p_tu);
+ tu_leave(&p->p_tu, gen);
}
/*
Index: sys/kern/kern_sysctl.c
===================================================================
RCS file: /cvs/src/sys/kern/kern_sysctl.c,v
diff -u -p -r1.465 kern_sysctl.c
--- sys/kern/kern_sysctl.c 27 Apr 2025 00:58:55 -0000 1.465
+++ sys/kern/kern_sysctl.c 4 May 2025 07:18:11 -0000
@@ -172,6 +172,8 @@ int hw_sysctl_locked(int *, u_int, void
int (*cpu_cpuspeed)(int *);
+static void sysctl_ci_cp_time(struct cpu_info *, uint64_t *);
+
/*
* Lock to avoid too many processes vslocking a large amount of memory
* at the same time.
@@ -682,11 +684,15 @@ kern_sysctl_locked(int *name, u_int name
memset(cp_time, 0, sizeof(cp_time));
CPU_INFO_FOREACH(cii, ci) {
+ uint64_t ci_cp_time[CPUSTATES];
+
if (!cpu_is_online(ci))
continue;
+
n++;
+ sysctl_ci_cp_time(ci, ci_cp_time);
for (i = 0; i < CPUSTATES; i++)
- cp_time[i] += ci->ci_schedstate.spc_cp_time[i];
+ cp_time[i] += ci_cp_time[i];
}
for (i = 0; i < CPUSTATES; i++)
@@ -2793,12 +2799,27 @@ sysctl_sensors(int *name, u_int namelen,
}
#endif /* SMALL_KERNEL */
+static void
+sysctl_ci_cp_time(struct cpu_info *ci, uint64_t *cp_time)
+{
+ struct schedstate_percpu *spc = &ci->ci_schedstate;
+ unsigned int gen;
+
+ pc_cons_enter(&spc->spc_cp_time_lock, &gen);
+ do {
+ int i;
+ for (i = 0; i < CPUSTATES; i++)
+ cp_time[i] = spc->spc_cp_time[i];
+ } while (pc_cons_leave(&spc->spc_cp_time_lock, &gen) != 0);
+}
+
int
sysctl_cptime2(int *name, u_int namelen, void *oldp, size_t *oldlenp,
void *newp, size_t newlen)
{
CPU_INFO_ITERATOR cii;
struct cpu_info *ci;
+ uint64_t cp_time[CPUSTATES];
int found = 0;
if (namelen != 1)
@@ -2813,9 +2834,10 @@ sysctl_cptime2(int *name, u_int namelen,
if (!found)
return (ENOENT);
+ sysctl_ci_cp_time(ci, cp_time);
+
return (sysctl_rdstruct(oldp, oldlenp, newp,
- &ci->ci_schedstate.spc_cp_time,
- sizeof(ci->ci_schedstate.spc_cp_time)));
+ cp_time, sizeof(cp_time)));
}
#if NAUDIO > 0
@@ -2881,7 +2903,7 @@ sysctl_cpustats(int *name, u_int namelen
return (ENOENT);
memset(&cs, 0, sizeof cs);
- memcpy(&cs.cs_time, &ci->ci_schedstate.spc_cp_time, sizeof(cs.cs_time));
+ sysctl_ci_cp_time(ci, cs.cs_time);
cs.cs_flags = 0;
if (cpu_is_online(ci))
cs.cs_flags |= CPUSTATS_ONLINE;
Index: sys/kern/sched_bsd.c
===================================================================
RCS file: /cvs/src/sys/kern/sched_bsd.c,v
diff -u -p -r1.99 sched_bsd.c
--- sys/kern/sched_bsd.c 10 Mar 2025 09:28:56 -0000 1.99
+++ sys/kern/sched_bsd.c 4 May 2025 07:18:11 -0000
@@ -585,6 +585,7 @@ setperf_auto(void *v)
CPU_INFO_ITERATOR cii;
struct cpu_info *ci;
uint64_t idle, total, allidle = 0, alltotal = 0;
+ unsigned int gen;
if (!perfpolicy_dynamic())
return;
@@ -609,14 +610,23 @@ setperf_auto(void *v)
return;
}
CPU_INFO_FOREACH(cii, ci) {
+ struct schedstate_percpu *spc;
+
if (!cpu_is_online(ci))
continue;
- total = 0;
- for (i = 0; i < CPUSTATES; i++) {
- total += ci->ci_schedstate.spc_cp_time[i];
- }
+
+ spc = &ci->ci_schedstate;
+ pc_cons_enter(&spc->spc_cp_time_lock, &gen);
+ do {
+ total = 0;
+ for (i = 0; i < CPUSTATES; i++) {
+ total += spc->spc_cp_time[i];
+ }
+ idle = spc->spc_cp_time[CP_IDLE];
+ } while (pc_cons_leave(&spc->spc_cp_time_lock, &gen) != 0);
+
total -= totalticks[j];
- idle = ci->ci_schedstate.spc_cp_time[CP_IDLE] - idleticks[j];
+ idle -= idleticks[j];
if (idle < total / 3)
speedup = 1;
alltotal += total;
Index: sys/sys/pclock.h
===================================================================
RCS file: sys/sys/pclock.h
diff -N sys/sys/pclock.h
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ sys/sys/pclock.h 4 May 2025 07:18:11 -0000
@@ -0,0 +1,49 @@
+/* $OpenBSD$ */
+
+/*
+ * Copyright (c) 2023 David Gwynne <dlg@openbsd.org>
+ *
+ * Permission to use, copy, modify, and distribute this software for any
+ * purpose with or without fee is hereby granted, provided that the above
+ * copyright notice and this permission notice appear in all copies.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+ */
+
+#ifndef _SYS_PCLOCK_H
+#define _SYS_PCLOCK_H
+
+#include <sys/_lock.h>
+
+struct pc_lock {
+ volatile unsigned int pcl_gen;
+};
+
+#ifdef _KERNEL
+
+#define PC_LOCK_INITIALIZER() { .pcl_gen = 0 }
+
+void pc_lock_init(struct pc_lock *);
+
+/* single (non-interlocking) producer */
+unsigned int pc_sprod_enter(struct pc_lock *);
+void pc_sprod_leave(struct pc_lock *, unsigned int);
+
+/* multiple (interlocking) producers */
+unsigned int pc_mprod_enter(struct pc_lock *);
+void pc_mprod_leave(struct pc_lock *, unsigned int);
+
+/* consumer */
+void pc_cons_enter(struct pc_lock *, unsigned int *);
+__warn_unused_result int
+ pc_cons_leave(struct pc_lock *, unsigned int *);
+
+#endif /* _KERNEL */
+
+#endif /* _SYS_PCLOCK_H */
Index: sys/sys/proc.h
===================================================================
RCS file: /cvs/src/sys/sys/proc.h,v
diff -u -p -r1.387 proc.h
--- sys/sys/proc.h 2 May 2025 05:04:38 -0000 1.387
+++ sys/sys/proc.h 4 May 2025 07:18:11 -0000
@@ -51,6 +51,7 @@
#include <sys/rwlock.h> /* For struct rwlock */
#include <sys/sigio.h> /* For struct sigio */
#include <sys/refcnt.h> /* For struct refcnt */
+#include <sys/pclock.h>
#ifdef _KERNEL
#include <sys/atomic.h>
@@ -91,8 +92,8 @@ struct pgrp {
* Each thread is immediately accumulated here. For processes only the
* time of exited threads is accumulated and to get the proper process
* time usage tuagg_get_process() needs to be called.
- * Accounting of threads is done lockless by curproc using the tu_gen
- * generation counter. Code should use tu_enter() and tu_leave() for this.
+ * Accounting of threads is done lockless by curproc using the tu_pcl
+ * pc_lock. Code should use tu_enter() and tu_leave() for this.
* The process ps_tu structure is locked by the ps_mtx.
*/
#define TU_UTICKS 0 /* Statclock hits in user mode. */
@@ -101,7 +102,7 @@ struct pgrp {
#define TU_TICKS_COUNT 3
struct tusage {
- uint64_t tu_gen; /* generation counter */
+ struct pc_lock tu_pcl;
uint64_t tu_ticks[TU_TICKS_COUNT];
#define tu_uticks tu_ticks[TU_UTICKS]
#define tu_sticks tu_ticks[TU_STICKS]
@@ -125,8 +126,6 @@ struct tusage {
* run-time information needed by threads.
*/
#ifdef __need_process
-struct futex;
-LIST_HEAD(futex_list, futex);
struct proc;
struct tslpentry;
TAILQ_HEAD(tslpqueue, tslpentry);
@@ -187,7 +186,6 @@ struct process {
struct vmspace *ps_vmspace; /* Address space */
pid_t ps_pid; /* [I] Process identifier. */
- struct futex_list ps_ftlist; /* futexes attached to this process */
struct tslpqueue ps_tslpqueue; /* [p] queue of threads in thrsleep */
struct rwlock ps_lock; /* per-process rwlock */
struct mutex ps_mtx; /* per-process mutex */
@@ -353,9 +351,6 @@ struct proc {
struct process *p_p; /* [I] The process of this thread. */
TAILQ_ENTRY(proc) p_thr_link; /* [K|m] Threads in a process linkage. */
- TAILQ_ENTRY(proc) p_fut_link; /* Threads in a futex linkage. */
- struct futex *p_futex; /* Current sleeping futex. */
-
/* substructures: */
struct filedesc *p_fd; /* copy of p_p->ps_fd */
struct vmspace *p_vmspace; /* [I] copy of p_p->ps_vmspace */
@@ -655,18 +650,16 @@ void cpuset_complement(struct cpuset *,
int cpuset_cardinality(struct cpuset *);
struct cpu_info *cpuset_first(struct cpuset *);
-static inline void
+static inline unsigned int
tu_enter(struct tusage *tu)
{
- ++tu->tu_gen; /* make the generation number odd */
- membar_producer();
+ return pc_sprod_enter(&tu->tu_pcl);
}
static inline void
-tu_leave(struct tusage *tu)
+tu_leave(struct tusage *tu, unsigned int gen)
{
- membar_producer();
- ++tu->tu_gen; /* make the generation number even again */
+ pc_sprod_leave(&tu->tu_pcl, gen);
}
#endif /* _KERNEL */
Index: sys/sys/sched.h
===================================================================
RCS file: /cvs/src/sys/sys/sched.h,v
diff -u -p -r1.73 sched.h
--- sys/sys/sched.h 8 Jul 2024 14:46:47 -0000 1.73
+++ sys/sys/sched.h 4 May 2025 07:18:11 -0000
@@ -97,6 +97,7 @@ struct cpustats {
#include <sys/clockintr.h>
#include <sys/queue.h>
+#include <sys/pclock.h>
#define SCHED_NQS 32 /* 32 run queues. */
@@ -112,6 +113,7 @@ struct schedstate_percpu {
struct timespec spc_runtime; /* time curproc started running */
volatile int spc_schedflags; /* flags; see below */
u_int spc_schedticks; /* ticks for schedclock() */
+ struct pc_lock spc_cp_time_lock;
u_int64_t spc_cp_time[CPUSTATES]; /* CPU state statistics */
u_char spc_curpriority; /* usrpri of curproc */