From: Vitaliy Makkoveev
Subject: Re: producer/consumer locking
To: David Gwynne
Cc: tech@openbsd.org
Date: Sun, 4 May 2025 17:33:08 +0300

Like this! I wanted to have something like mtx_enter_read() for a long
time.

> On 4 May 2025, at 10:20, David Gwynne wrote:
> 
> this provides coordination between things producing and consuming
> data when you don't want to block or delay the thing producing data,
> but it's ok to make the consumer do more work to compensate for that.
> 
> the mechanism is a generalisation of the coordination used in the mp
> counters api and some of the process accounting code. data updated
> by a producer is versioned, and the consumer reads the version on each
> side of the critical section to see if it's been updated. if the
> producer has updated the version, then the consumer has to retry.
> 
> the diff includes the migration of the process accounting to the
> generalised api, and adds it to the cpu state counters on each cpu.
> it's now possible to get a consistent snapshot of the cpu counters, even
> if they were preempted by statclock.
> 
> i also have a pf diff that uses these. it allows pf to maintain counters
> that userland can read without blocking the execution of pf. handy.
> 
> the only thing im worried about is the use of the alias function
> attributes for !MULTIPROCESSOR kernels, but we use those in libc on all
> our archs/compilers and it seems to be fine.
> 
> the manpage looks like this:
> 
> PC_LOCK_INIT(9)         Kernel Developer's Manual        PC_LOCK_INIT(9)
> 
> NAME
>      pc_lock_init, pc_cons_enter, pc_cons_leave, pc_sprod_enter,
>      pc_sprod_leave, pc_mprod_enter, pc_mprod_leave, PC_LOCK_INITIALIZER -
>      producer/consumer locks
> 
> SYNOPSIS
>      #include <sys/pclock.h>
> 
>      void
>      pc_lock_init(struct pc_lock *pcl);
> 
>      void
>      pc_cons_enter(struct pc_lock *pcl, unsigned int *genp);
> 
>      int
>      pc_cons_leave(struct pc_lock *pcl, unsigned int *genp);
> 
>      unsigned int
>      pc_sprod_enter(struct pc_lock *pcl);
> 
>      void
>      pc_sprod_leave(struct pc_lock *pcl, unsigned int gen);
> 
>      unsigned int
>      pc_mprod_enter(struct pc_lock *pcl);
> 
>      void
>      pc_mprod_leave(struct pc_lock *pcl, unsigned int gen);
> 
>      PC_LOCK_INITIALIZER();
> 
> DESCRIPTION
>      The producer/consumer lock functions provide mechanisms for a consumer
>      to read data without blocking or delaying another CPU or an interrupt
>      when it is updating or producing data.  A variant of the producer
>      locking functions provides mutual exclusion between concurrent
>      producers.
> 
>      This is implemented by having producers version the protected data
>      with a generation number.  Consumers of the data compare the
>      generation number at the start of the critical section to the
>      generation number at the end, and must retry reading the data if the
>      generation number has changed.
> 
>      The pc_lock_init() function is used to initialise the
>      producer/consumer lock pointed to by pcl.
> 
>      A producer/consumer lock declaration may be initialised with the
>      PC_LOCK_INITIALIZER() macro.
> 
>    Consumer API
>      pc_cons_enter() reads the current generation number from pcl and
>      stores it in the memory provided by the caller via genp.
> 
>      pc_cons_leave() compares the generation number in pcl with the value
>      stored in genp by pc_cons_enter() at the start of the critical
>      section, and returns whether the reads within the critical section
>      need to be retried because the data has been updated by the producer.
> 
>    Single Producer API
>      The single producer API is optimised for updating data from code
>      where only a single producer can enter the critical section at a
>      time.
> 
>      pc_sprod_enter() marks the beginning of a single producer critical
>      section for the pcl producer/consumer lock.
> 
>      pc_sprod_leave() marks the end of a single producer critical section
>      for the pcl producer/consumer lock.  The gen argument must be the
>      value returned from the preceding pc_sprod_enter() call.
> 
>    Multiple Producer API
>      The multiple producer API provides mutual exclusion between multiple
>      CPUs entering the critical section concurrently.  Unlike
>      mtx_enter(9), the multiple producer API does not prevent preemption
>      by interrupts, it only provides mutual exclusion between CPUs.  If
>      protection from preemption is required, splraise(9) can be used to
>      protect the producer critical section.
> 
>      pc_mprod_enter() marks the beginning of a multiple producer critical
>      section for the pcl producer/consumer lock.
> 
>      pc_mprod_leave() marks the end of a multiple producer critical
>      section for the pcl producer/consumer lock.  The gen argument must
>      be the value returned from the preceding pc_mprod_enter() call.
> 
>      On uniprocessor kernels the multiple producer API is aliased to the
>      single producer API.
> 
> CONTEXT
>      pc_lock_init(), pc_cons_enter(), pc_cons_leave(), pc_sprod_enter(),
>      pc_sprod_leave(), pc_mprod_enter(), and pc_mprod_leave() can be
>      called during autoconf, from process context, or from interrupt
>      context.
> 
>      pc_sprod_enter(), pc_sprod_leave(), pc_mprod_enter(), and
>      pc_mprod_leave() may run concurrently with (ie, on another CPU to),
>      or preempt (ie, run at a higher interrupt level than),
>      pc_cons_enter() and pc_cons_leave().
> 
>      pc_sprod_enter(), pc_sprod_leave(), pc_mprod_enter(), and
>      pc_mprod_leave() must not be preempted or interrupted by the
>      producer or consumer API for the same lock.
> 
> RETURN VALUES
>      pc_cons_leave() returns 0 if the critical section did not overlap
>      with an update from a producer, or non-zero if the critical section
>      must be retried.
> 
> EXAMPLES
>      To produce or update data:
> 
>            struct pc_lock pc = PC_LOCK_INITIALIZER();
> 
>            void
>            producer(void)
>            {
>                    unsigned int gen;
> 
>                    gen = pc_sprod_enter(&pc);
>                    /* update data */
>                    pc_sprod_leave(&pc, gen);
>            }
> 
>      A consistent read of the data from a consumer:
> 
>            void
>            consumer(void)
>            {
>                    unsigned int gen;
> 
>                    pc_cons_enter(&pc, &gen);
>                    do {
>                            /* read data */
>                    } while (pc_cons_leave(&pc, &gen) != 0);
>            }
> 
> SEE ALSO
>      mutex(9), splraise(9)
> 
> HISTORY
>      The pc_lock_init functions first appeared in OpenBSD 7.8.
> 
> AUTHORS
>      The pc_lock_init functions were written by David Gwynne
>      <dlg@openbsd.org>.
> 
> CAVEATS
>      Updates must be produced infrequently enough to allow time for
>      consumers to be able to get a consistent read without looping too
>      often.
> 
>      Because consuming the data may loop when retrying, care must be
>      taken to avoid side effects from reading the data multiple times,
>      eg, when accumulating values.
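> 
> the manpage examples only cover the sprod side, so here are two extra
> sketches. they are not part of the diff; the counters array, function
> names, and the ipl are made up for illustration. the first pairs the
> mprod api with splraise(9) as suggested above, so producers on other
> cpus and interrupts on the local cpu both stay out of the critical
> section:
> 
> 	struct pc_lock pc = PC_LOCK_INITIALIZER();
> 	uint64_t counters[4];
> 
> 	void
> 	mproducer(const uint64_t *deltas)
> 	{
> 		unsigned int gen;
> 		int i, s;
> 
> 		s = splsoftnet();	/* hypothetical ipl for the sketch */
> 		gen = pc_mprod_enter(&pc);
> 		for (i = 0; i < 4; i++)
> 			counters[i] += deltas[i];
> 		pc_mprod_leave(&pc, gen);
> 		splx(s);
> 	}
> 
> the second shows the pattern the CAVEATS section is getting at: read
> into a local snapshot inside the retry loop and only accumulate once
> pc_cons_leave() says the read was consistent, the same way
> tuagg_sumup() does below:
> 
> 	void
> 	accumulate(uint64_t *sums)
> 	{
> 		uint64_t snap[4];
> 		unsigned int gen;
> 		int i;
> 
> 		pc_cons_enter(&pc, &gen);
> 		do {
> 			for (i = 0; i < 4; i++)
> 				snap[i] = counters[i];
> 		} while (pc_cons_leave(&pc, &gen) != 0);
> 
> 		/* accumulating inside the loop could double count on retry */
> 		for (i = 0; i < 4; i++)
> 			sums[i] += snap[i];
> 	}
> 
> ok?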
> 
> Index: share/man/man9/Makefile
> ===================================================================
> RCS file: /cvs/src/share/man/man9/Makefile,v
> diff -u -p -r1.310 Makefile
> --- share/man/man9/Makefile	24 Feb 2024 16:21:32 -0000	1.310
> +++ share/man/man9/Makefile	4 May 2025 07:18:11 -0000
> @@ -29,7 +29,8 @@ MAN=	aml_evalnode.9 atomic_add_int.9 ato
>  	malloc.9 membar_sync.9 memcmp.9 mbuf.9 mbuf_tags.9 md5.9 mi_switch.9 \
>  	microtime.9 ml_init.9 mq_init.9 mutex.9 \
>  	namei.9 \
> -	panic.9 pci_conf_read.9 pci_mapreg_map.9 pci_intr_map.9 physio.9 \
> +	panic.9 pci_conf_read.9 pci_mapreg_map.9 pci_intr_map.9 \
> +	pc_lock_init.9 physio.9 \
>  	pmap.9 pool.9 pool_cache_init.9 ppsratecheck.9 printf.9 psignal.9 \
>  	RBT_INIT.9 \
>  	radio.9 arc4random.9 rasops.9 ratecheck.9 refcnt_init.9 resettodr.9 \
> Index: share/man/man9/pc_lock_init.9
> ===================================================================
> RCS file: share/man/man9/pc_lock_init.9
> diff -N share/man/man9/pc_lock_init.9
> --- /dev/null	1 Jan 1970 00:00:00 -0000
> +++ share/man/man9/pc_lock_init.9	4 May 2025 07:18:11 -0000
> @@ -0,0 +1,212 @@
> +.\" $OpenBSD$
> +.\"
> +.\" Copyright (c) 2025 David Gwynne
> +.\" All rights reserved.
> +.\"
> +.\" Permission to use, copy, modify, and distribute this software for any
> +.\" purpose with or without fee is hereby granted, provided that the above
> +.\" copyright notice and this permission notice appear in all copies.
> +.\"
> +.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
> +.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
> +.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
> +.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
> +.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
> +.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
> +.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
> +.\"
> +.Dd $Mdocdate: May 4 2025 $
> +.Dt PC_LOCK_INIT 9
> +.Os
> +.Sh NAME
> +.Nm pc_lock_init ,
> +.Nm pc_cons_enter ,
> +.Nm pc_cons_leave ,
> +.Nm pc_sprod_enter ,
> +.Nm pc_sprod_leave ,
> +.Nm pc_mprod_enter ,
> +.Nm pc_mprod_leave ,
> +.Nm PC_LOCK_INITIALIZER
> +.Nd producer/consumer locks
> +.Sh SYNOPSIS
> +.In sys/pclock.h
> +.Ft void
> +.Fn pc_lock_init "struct pc_lock *pcl"
> +.Ft void
> +.Fn pc_cons_enter "struct pc_lock *pcl" "unsigned int *genp"
> +.Ft int
> +.Fn pc_cons_leave "struct pc_lock *pcl" "unsigned int *genp"
> +.Ft unsigned int
> +.Fn pc_sprod_enter "struct pc_lock *pcl"
> +.Ft void
> +.Fn pc_sprod_leave "struct pc_lock *pcl" "unsigned int gen"
> +.Ft unsigned int
> +.Fn pc_mprod_enter "struct pc_lock *pcl"
> +.Ft void
> +.Fn pc_mprod_leave "struct pc_lock *pcl" "unsigned int gen"
> +.Fn PC_LOCK_INITIALIZER
> +.Sh DESCRIPTION
> +The producer/consumer lock functions provide mechanisms for a
> +consumer to read data without blocking or delaying another CPU or
> +an interrupt when it is updating or producing data.
> +A variant of the producer locking functions provides mutual exclusion
> +between multiple producers.
> +.Pp
> +This is implemented by having producers version the protected data
> +with a generation number.
> +Consumers of the data compare the generation number at the start
> +of the critical section to the generation number at the end, and
> +must retry reading the data if the generation number has changed.
> +.Pp
> +The
> +.Fn pc_lock_init
> +function is used to initialise the producer/consumer lock pointed to by
> +.Fa pcl .
> +.Pp
> +A producer/consumer lock declaration may be initialised with the
> +.Fn PC_LOCK_INITIALIZER
> +macro.
> +.Ss Consumer API
> +.Fn pc_cons_enter
> +reads the current generation number from
> +.Fa pcl
> +and stores it in the memory provided by the caller via
> +.Fa genp .
> +.Pp
> +.Fn pc_cons_leave
> +compares the generation number in
> +.Fa pcl
> +with the value stored in
> +.Fa genp
> +by
> +.Fn pc_cons_enter
> +at the start of the critical section, and returns whether the reads
> +within the critical section need to be retried because the data has
> +been updated by the producer.
> +.Ss Single Producer API
> +The single producer API is optimised for updating data from code
> +where only a single producer can enter the critical section at a time.
> +.Pp
> +.Fn pc_sprod_enter
> +marks the beginning of a single producer critical section for the
> +.Fa pcl
> +producer/consumer lock.
> +.Pp
> +.Fn pc_sprod_leave
> +marks the end of a single producer critical section for the
> +.Fa pcl
> +producer/consumer lock.
> +The
> +.Fa gen
> +argument must be the value returned from the preceding
> +.Fn pc_sprod_enter
> +call.
> +.Ss Multiple Producer API
> +The multiple producer API provides mutual exclusion between multiple
> +CPUs entering the critical section concurrently.
> +Unlike
> +.Xr mtx_enter 9 ,
> +the multiple producer API does not prevent preemption by interrupts,
> +it only provides mutual exclusion between CPUs.
> +If protection from preemption is required,
> +.Xr splraise 9
> +can be used to protect the producer critical section.
> +.Pp
> +.Fn pc_mprod_enter
> +marks the beginning of a multiple producer critical section for the
> +.Fa pcl
> +producer/consumer lock.
> +.Pp
> +.Fn pc_mprod_leave
> +marks the end of a multiple producer critical section for the
> +.Fa pcl
> +producer/consumer lock.
> +The
> +.Fa gen
> +argument must be the value returned from the preceding
> +.Fn pc_mprod_enter
> +call.
> +.Pp
> +On uniprocessor kernels the multiple producer API is aliased to the
> +single producer API.
> +.Sh CONTEXT
> +.Fn pc_lock_init ,
> +.Fn pc_cons_enter ,
> +.Fn pc_cons_leave ,
> +.Fn pc_sprod_enter ,
> +.Fn pc_sprod_leave ,
> +.Fn pc_mprod_enter ,
> +and
> +.Fn pc_mprod_leave
> +can be called during autoconf, from process context, or from
> +interrupt context.
> +.Pp
> +.Fn pc_sprod_enter ,
> +.Fn pc_sprod_leave ,
> +.Fn pc_mprod_enter ,
> +and
> +.Fn pc_mprod_leave
> +may run concurrently with (ie, on another CPU to), or preempt
> +(ie, run at a higher interrupt level than),
> +.Fn pc_cons_enter
> +and
> +.Fn pc_cons_leave .
> +.Pp
> +.Fn pc_sprod_enter ,
> +.Fn pc_sprod_leave ,
> +.Fn pc_mprod_enter ,
> +and
> +.Fn pc_mprod_leave
> +must not be preempted or interrupted by the producer or consumer
> +API for the same lock.
> +.Sh RETURN VALUES
> +.Fn pc_cons_leave
> +returns 0 if the critical section did not overlap with an update
> +from a producer, or non-zero if the critical section must be retried.
> +.Sh EXAMPLES
> +To produce or update data:
> +.Bd -literal -offset indent
> +struct pc_lock pc = PC_LOCK_INITIALIZER();
> +
> +void
> +producer(void)
> +{
> +	unsigned int gen;
> +
> +	gen = pc_sprod_enter(&pc);
> +	/* update data */
> +	pc_sprod_leave(&pc, gen);
> +}
> +.Ed
> +.Pp
> +A consistent read of the data from a consumer:
> +.Bd -literal -offset indent
> +void
> +consumer(void)
> +{
> +	unsigned int gen;
> +
> +	pc_cons_enter(&pc, &gen);
> +	do {
> +		/* read data */
> +	} while (pc_cons_leave(&pc, &gen) != 0);
> +}
> +.Ed
> +.Sh SEE ALSO
> +.Xr mutex 9 ,
> +.Xr splraise 9
> +.Sh HISTORY
> +The
> +.Nm
> +functions first appeared in
> +.Ox 7.8 .
> +.Sh AUTHORS
> +The
> +.Nm
> +functions were written by
> +.An David Gwynne Aq Mt dlg@openbsd.org .
> +.Sh CAVEATS
> +Updates must be produced infrequently enough to allow time for
> +consumers to be able to get a consistent read without looping too
> +often.
> +.Pp
> +Because consuming the data may loop when retrying, care must be
> +taken to avoid side effects from reading the data multiple times,
> +eg, when accumulating values.
> Index: sys/kern/kern_clock.c
> ===================================================================
> RCS file: /cvs/src/sys/kern/kern_clock.c,v
> diff -u -p -r1.125 kern_clock.c
> --- sys/kern/kern_clock.c	2 May 2025 05:04:38 -0000	1.125
> +++ sys/kern/kern_clock.c	4 May 2025 07:18:11 -0000
> @@ -270,6 +270,7 @@ statclock(struct clockrequest *cr, void
>  	struct process *pr;
>  	int tu_tick = -1;
>  	int cp_time;
> +	unsigned int gen;
>  
>  	if (statclock_is_randomized) {
>  		count = clockrequest_advance_random(cr, statclock_min,
> @@ -313,7 +314,9 @@ statclock(struct clockrequest *cr, void
>  		cp_time = CP_SPIN;
>  	}
>  
> +	gen = pc_sprod_enter(&spc->spc_cp_time_lock);
>  	spc->spc_cp_time[cp_time] += count;
> +	pc_sprod_leave(&spc->spc_cp_time_lock, gen);
>  
>  	if (p != NULL) {
>  		p->p_cpticks += count;
> @@ -322,7 +325,7 @@ statclock(struct clockrequest *cr, void
>  		struct vmspace *vm = p->p_vmspace;
>  		struct tusage *tu = &p->p_tu;
>  
> -		tu_enter(tu);
> +		gen = tu_enter(tu);
>  		tu->tu_ticks[tu_tick] += count;
>  
>  		/* maxrss is handled by uvm */
> @@ -334,7 +337,7 @@ statclock(struct clockrequest *cr, void
>  			tu->tu_isrss +=
>  			    (vm->vm_ssize << (PAGE_SHIFT - 10)) * count;
>  		}
> -		tu_leave(tu);
> +		tu_leave(tu, gen);
>  	}
>  
>  	/*
> Index: sys/kern/kern_exec.c
> ===================================================================
> RCS file: /cvs/src/sys/kern/kern_exec.c,v
> diff -u -p -r1.262 kern_exec.c
> --- sys/kern/kern_exec.c	17 Feb 2025 10:07:10 -0000	1.262
> +++ sys/kern/kern_exec.c	4 May 2025 07:18:11 -0000
> @@ -699,7 +699,7 @@ sys_execve(struct proc *p, void *v, regi
>  	/* reset CPU time usage for the thread, but not the process */
>  	timespecclear(&p->p_tu.tu_runtime);
>  	p->p_tu.tu_uticks = p->p_tu.tu_sticks = p->p_tu.tu_iticks = 0;
> -	p->p_tu.tu_gen = 0;
> +	pc_lock_init(&p->p_tu.tu_pcl);
>  
>  	memset(p->p_name, 0, sizeof p->p_name);
>  
> Index: sys/kern/kern_lock.c
> ===================================================================
> RCS file: /cvs/src/sys/kern/kern_lock.c,v
> diff -u -p -r1.75 kern_lock.c
> --- sys/kern/kern_lock.c	3 Jul 2024 01:36:50 -0000	1.75
> +++ sys/kern/kern_lock.c	4 May 2025 07:18:11 -0000
> @@ -24,6 +24,7 @@
>  #include
>  #include
>  #include
> +#include <sys/pclock.h>
>  
>  #include
>  
> @@ -418,3 +419,102 @@ _mtx_init_flags(struct mutex *m, int ipl
>  	_mtx_init(m, ipl);
>  }
>  #endif /* WITNESS */
> +
> +void
> +pc_lock_init(struct pc_lock *pcl)
> +{
> +	pcl->pcl_gen = 0;
> +}
> +
> +unsigned int
> +pc_sprod_enter(struct pc_lock *pcl)
> +{
> +	unsigned int gen;
> +
> +	gen = pcl->pcl_gen;
> +	pcl->pcl_gen = ++gen;
> +	membar_producer();
> +
> +	return (gen);
> +}
> +
> +void
> +pc_sprod_leave(struct pc_lock *pcl, unsigned int gen)
> +{
> +	membar_producer();
> +	pcl->pcl_gen = ++gen;
> +}
> +
> +#ifdef MULTIPROCESSOR
> +unsigned int
> +pc_mprod_enter(struct pc_lock *pcl)
> +{
> +	unsigned int gen, ngen, ogen;
> +
> +	gen = pcl->pcl_gen;
> +	for (;;) {
> +		while (gen & 1) {
> +			CPU_BUSY_CYCLE();
> +			gen = pcl->pcl_gen;
> +		}
> +
> +		ngen = 1 + gen;
> +		ogen = atomic_cas_uint(&pcl->pcl_gen, gen, ngen);
> +		if (gen == ogen)
> +			break;
> +
> +		CPU_BUSY_CYCLE();
> +		gen = ogen;
> +	}
> +
> +	membar_enter_after_atomic();
> +	return (ngen);
> +}
> +
> +void
> +pc_mprod_leave(struct pc_lock *pcl, unsigned int gen)
> +{
> +	membar_exit();
> +	pcl->pcl_gen = ++gen;
> +}
> +#else /* MULTIPROCESSOR */
> +unsigned int pc_mprod_enter(struct pc_lock *)
> +	__attribute__((alias("pc_sprod_enter")));
> +void pc_mprod_leave(struct pc_lock *, unsigned int)
> +	__attribute__((alias("pc_sprod_leave")));
> +#endif /* MULTIPROCESSOR */
> +
> +void
> +pc_cons_enter(struct pc_lock *pcl, unsigned int *genp)
> +{
> +	unsigned int gen;
> +
> +	gen = pcl->pcl_gen;
> +	while (gen & 1) {
> +		CPU_BUSY_CYCLE();
> +		gen = pcl->pcl_gen;
> +	}
> +
> +	membar_consumer();
> +	*genp = gen;
> +}
> +
> +int
> +pc_cons_leave(struct pc_lock *pcl, unsigned int *genp)
> +{
> +	unsigned int gen;
> +
> +	membar_consumer();
> +
> +	gen = pcl->pcl_gen;
> +	if (gen & 1) {
> +		do {
> +			CPU_BUSY_CYCLE();
> +			gen = pcl->pcl_gen;
> +		} while (gen & 1);
> +	} else if (gen == *genp)
> +		return (0);
> +
> +	*genp = gen;
> +	return (EBUSY);
> +}
> Index: sys/kern/kern_resource.c
> ===================================================================
> RCS file: /cvs/src/sys/kern/kern_resource.c,v
> diff -u -p -r1.94 kern_resource.c
> --- sys/kern/kern_resource.c	2 May 2025 05:04:38 -0000	1.94
> +++ sys/kern/kern_resource.c	4 May 2025 07:18:11 -0000
> @@ -63,7 +63,7 @@ struct plimit *lim_copy(struct plimit *)
>  struct plimit *lim_write_begin(void);
>  void	lim_write_commit(struct plimit *);
>  
> -void	tuagg_sumup(struct tusage *, const struct tusage *);
> +void	tuagg_sumup(struct tusage *, struct tusage *);
>  
>  /*
>   * Patchable maximum data and stack limits.
> @@ -369,28 +369,15 @@ sys_getrlimit(struct proc *p, void *v, r
>  
>  /* Add the counts from *from to *tu, ensuring a consistent read of *from. */
>  void
> -tuagg_sumup(struct tusage *tu, const struct tusage *from)
> +tuagg_sumup(struct tusage *tu, struct tusage *from)
>  {
>  	struct tusage tmp;
> -	uint64_t enter, leave;
> +	unsigned int gen;
>  
> -	enter = from->tu_gen;
> -	for (;;) {
> -		/* the generation number is odd during an update */
> -		while (enter & 1) {
> -			CPU_BUSY_CYCLE();
> -			enter = from->tu_gen;
> -		}
> -
> -		membar_consumer();
> +	pc_cons_enter(&from->tu_pcl, &gen);
> +	do {
>  		tmp = *from;
> -		membar_consumer();
> -		leave = from->tu_gen;
> -
> -		if (enter == leave)
> -			break;
> -		enter = leave;
> -	}
> +	} while (pc_cons_leave(&from->tu_pcl, &gen) != 0);
>  
>  	tu->tu_uticks += tmp.tu_uticks;
>  	tu->tu_sticks += tmp.tu_sticks;
> @@ -433,12 +420,14 @@ tuagg_get_process(struct tusage *tu, str
>  void
>  tuagg_add_process(struct process *pr, struct proc *p)
>  {
> +	unsigned int gen;
> +
>  	MUTEX_ASSERT_LOCKED(&pr->ps_mtx);
>  	KASSERT(curproc == p || p->p_stat == SDEAD);
>  
> -	tu_enter(&pr->ps_tu);
> +	gen = tu_enter(&pr->ps_tu);
>  	tuagg_sumup(&pr->ps_tu, &p->p_tu);
> -	tu_leave(&pr->ps_tu);
> +	tu_leave(&pr->ps_tu, gen);
>  
>  	/* Now reset CPU time usage for the thread. */
>  	timespecclear(&p->p_tu.tu_runtime);
> @@ -452,6 +441,7 @@ tuagg_add_runtime(void)
>  	struct schedstate_percpu *spc = &curcpu()->ci_schedstate;
>  	struct proc *p = curproc;
>  	struct timespec ts, delta;
> +	unsigned int gen;
>  
>  	/*
>  	 * Compute the amount of time during which the current
> @@ -472,9 +462,9 @@ tuagg_add_runtime(void)
>  	}
>  	/* update spc_runtime */
>  	spc->spc_runtime = ts;
> -	tu_enter(&p->p_tu);
> +	gen = tu_enter(&p->p_tu);
>  	timespecadd(&p->p_tu.tu_runtime, &delta, &p->p_tu.tu_runtime);
> -	tu_leave(&p->p_tu);
> +	tu_leave(&p->p_tu, gen);
>  }
>  
>  /*
> Index: sys/kern/kern_sysctl.c
> ===================================================================
> RCS file: /cvs/src/sys/kern/kern_sysctl.c,v
> diff -u -p -r1.465 kern_sysctl.c
> --- sys/kern/kern_sysctl.c	27 Apr 2025 00:58:55 -0000	1.465
> +++ sys/kern/kern_sysctl.c	4 May 2025 07:18:11 -0000
> @@ -172,6 +172,8 @@ int hw_sysctl_locked(int *, u_int, void
>  
>  int (*cpu_cpuspeed)(int *);
>  
> +static void	sysctl_ci_cp_time(struct cpu_info *, uint64_t *);
> +
>  /*
>   * Lock to avoid too many processes vslocking a large amount of memory
>   * at the same time.
> @@ -682,11 +684,15 @@ kern_sysctl_locked(int *name, u_int name
>  		memset(cp_time, 0, sizeof(cp_time));
>  
>  		CPU_INFO_FOREACH(cii, ci) {
> +			uint64_t ci_cp_time[CPUSTATES];
> +
>  			if (!cpu_is_online(ci))
>  				continue;
> +
>  			n++;
> +			sysctl_ci_cp_time(ci, ci_cp_time);
>  			for (i = 0; i < CPUSTATES; i++)
> -				cp_time[i] += ci->ci_schedstate.spc_cp_time[i];
> +				cp_time[i] += ci_cp_time[i];
>  		}
>  
>  		for (i = 0; i < CPUSTATES; i++)
> @@ -2793,12 +2799,27 @@ sysctl_sensors(int *name, u_int namelen,
>  }
>  #endif /* SMALL_KERNEL */
>  
> +static void
> +sysctl_ci_cp_time(struct cpu_info *ci, uint64_t *cp_time)
> +{
> +	struct schedstate_percpu *spc = &ci->ci_schedstate;
> +	unsigned int gen;
> +
> +	pc_cons_enter(&spc->spc_cp_time_lock, &gen);
> +	do {
> +		int i;
> +		for (i = 0; i < CPUSTATES; i++)
> +			cp_time[i] = spc->spc_cp_time[i];
> +	} while (pc_cons_leave(&spc->spc_cp_time_lock, &gen) != 0);
> +}
> +
>  int
>  sysctl_cptime2(int *name, u_int namelen, void *oldp, size_t *oldlenp,
>      void *newp, size_t newlen)
>  {
>  	CPU_INFO_ITERATOR cii;
>  	struct cpu_info *ci;
> +	uint64_t cp_time[CPUSTATES];
>  	int found = 0;
>  
>  	if (namelen != 1)
> @@ -2813,9 +2834,10 @@ sysctl_cptime2(int *name, u_int namelen,
>  	if (!found)
>  		return (ENOENT);
>  
> +	sysctl_ci_cp_time(ci, cp_time);
> +
>  	return (sysctl_rdstruct(oldp, oldlenp, newp,
> -	    &ci->ci_schedstate.spc_cp_time,
> -	    sizeof(ci->ci_schedstate.spc_cp_time)));
> +	    cp_time, sizeof(cp_time)));
>  }
>  
>  #if NAUDIO > 0
> @@ -2881,7 +2903,7 @@ sysctl_cpustats(int *name, u_int namelen
>  		return (ENOENT);
>  
>  	memset(&cs, 0, sizeof cs);
> -	memcpy(&cs.cs_time, &ci->ci_schedstate.spc_cp_time, sizeof(cs.cs_time));
> +	sysctl_ci_cp_time(ci, cs.cs_time);
>  	cs.cs_flags = 0;
>  	if (cpu_is_online(ci))
>  		cs.cs_flags |= CPUSTATS_ONLINE;
> Index: sys/kern/sched_bsd.c
> ===================================================================
> RCS file: /cvs/src/sys/kern/sched_bsd.c,v
> diff -u -p -r1.99 sched_bsd.c
> --- sys/kern/sched_bsd.c	10 Mar 2025 09:28:56 -0000	1.99
> +++ sys/kern/sched_bsd.c	4 May 2025 07:18:11 -0000
> @@ -585,6 +585,7 @@ setperf_auto(void *v)
>  	CPU_INFO_ITERATOR cii;
>  	struct cpu_info *ci;
>  	uint64_t idle, total, allidle = 0, alltotal = 0;
> +	unsigned int gen;
>  
>  	if (!perfpolicy_dynamic())
>  		return;
> @@ -609,14 +610,23 @@ setperf_auto(void *v)
>  		return;
>  	}
>  	CPU_INFO_FOREACH(cii, ci) {
> +		struct schedstate_percpu *spc;
> +
>  		if (!cpu_is_online(ci))
>  			continue;
> -		total = 0;
> -		for (i = 0; i < CPUSTATES; i++) {
> -			total += ci->ci_schedstate.spc_cp_time[i];
> -		}
> +
> +		spc = &ci->ci_schedstate;
> +		pc_cons_enter(&spc->spc_cp_time_lock, &gen);
> +		do {
> +			total = 0;
> +			for (i = 0; i < CPUSTATES; i++) {
> +				total += spc->spc_cp_time[i];
> +			}
> +			idle = spc->spc_cp_time[CP_IDLE];
> +		} while (pc_cons_leave(&spc->spc_cp_time_lock, &gen) != 0);
> +
>  		total -= totalticks[j];
> -		idle = ci->ci_schedstate.spc_cp_time[CP_IDLE] - idleticks[j];
> +		idle -= idleticks[j];
>  		if (idle < total / 3)
>  			speedup = 1;
>  		alltotal += total;
> Index: sys/sys/pclock.h
> ===================================================================
> RCS file: sys/sys/pclock.h
> diff -N sys/sys/pclock.h
> --- /dev/null	1 Jan 1970 00:00:00 -0000
> +++ sys/sys/pclock.h	4 May 2025 07:18:11 -0000
> @@ -0,0 +1,49 @@
> +/*	$OpenBSD$ */
> +
> +/*
> + * Copyright (c) 2023 David Gwynne
> + *
> + * Permission to use, copy, modify, and distribute this software for any
> + * purpose with or without fee is hereby granted, provided that the above
> + * copyright notice and this permission notice appear in all copies.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
> + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
> + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
> + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
> + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
> + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
> + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
> + */
> +
> +#ifndef _SYS_PCLOCK_H
> +#define _SYS_PCLOCK_H
> +
> +#include
> +
> +struct pc_lock {
> +	volatile unsigned int pcl_gen;
> +};
> +
> +#ifdef _KERNEL
> +
> +#define PC_LOCK_INITIALIZER()	{ .pcl_gen = 0 }
> +
> +void	pc_lock_init(struct pc_lock *);
> +
> +/* single (non-interlocking) producer */
> +unsigned int	pc_sprod_enter(struct pc_lock *);
> +void		pc_sprod_leave(struct pc_lock *, unsigned int);
> +
> +/* multiple (interlocking) producers */
> +unsigned int	pc_mprod_enter(struct pc_lock *);
> +void		pc_mprod_leave(struct pc_lock *, unsigned int);
> +
> +/* consumer */
> +void	pc_cons_enter(struct pc_lock *, unsigned int *);
> +__warn_unused_result int
> +	pc_cons_leave(struct pc_lock *, unsigned int *);
> +
> +#endif /* _KERNEL */
> +
> +#endif /* _SYS_PCLOCK_H */
> Index: sys/sys/proc.h
> ===================================================================
> RCS file: /cvs/src/sys/sys/proc.h,v
> diff -u -p -r1.387 proc.h
> --- sys/sys/proc.h	2 May 2025 05:04:38 -0000	1.387
> +++ sys/sys/proc.h	4 May 2025 07:18:11 -0000
> @@ -51,6 +51,7 @@
>  #include <sys/rwlock.h>		/* For struct rwlock */
>  #include <sys/sigio.h>		/* For struct sigio */
>  #include <sys/refcnt.h>		/* For struct refcnt */
> +#include <sys/pclock.h>
>  
>  #ifdef _KERNEL
>  #include
> @@ -91,8 +92,8 @@ struct pgrp {
>   * Each thread is immediately accumulated here. For processes only the
>   * time of exited threads is accumulated and to get the proper process
>   * time usage tuagg_get_process() needs to be called.
> - * Accounting of threads is done lockless by curproc using the tu_gen
> - * generation counter. Code should use tu_enter() and tu_leave() for this.
> + * Accounting of threads is done lockless by curproc using the tu_pcl
> + * pc_lock. Code should use tu_enter() and tu_leave() for this.
>   * The process ps_tu structure is locked by the ps_mtx.
>   */
>  #define	TU_UTICKS	0	/* Statclock hits in user mode. */
> @@ -101,7 +102,7 @@ struct pgrp {
>  #define	TU_TICKS_COUNT	3
>  
>  struct tusage {
> -	uint64_t	tu_gen;		/* generation counter */
> +	struct pc_lock	tu_pcl;
>  	uint64_t	tu_ticks[TU_TICKS_COUNT];
>  #define tu_uticks	tu_ticks[TU_UTICKS]
>  #define tu_sticks	tu_ticks[TU_STICKS]
> @@ -125,8 +126,6 @@ struct tusage {
>   * run-time information needed by threads.
>   */
>  #ifdef __need_process
> -struct futex;
> -LIST_HEAD(futex_list, futex);
>  struct proc;
>  struct tslpentry;
>  TAILQ_HEAD(tslpqueue, tslpentry);
> @@ -187,7 +186,6 @@ struct process {
>  	struct vmspace *ps_vmspace;	/* Address space */
>  	pid_t	ps_pid;			/* [I] Process identifier. */
>  
> -	struct futex_list ps_ftlist;	/* futexes attached to this process */
>  	struct tslpqueue ps_tslpqueue;	/* [p] queue of threads in thrsleep */
>  	struct rwlock ps_lock;		/* per-process rwlock */
>  	struct mutex ps_mtx;		/* per-process mutex */
> @@ -353,9 +351,6 @@ struct proc {
>  	struct process *p_p;		/* [I] The process of this thread. */
>  	TAILQ_ENTRY(proc) p_thr_link;	/* [K|m] Threads in a process linkage. */
>  
> -	TAILQ_ENTRY(proc) p_fut_link;	/* Threads in a futex linkage. */
> -	struct futex	*p_futex;	/* Current sleeping futex. */
> -
>  /* substructures: */
>  	struct filedesc	*p_fd;		/* copy of p_p->ps_fd */
>  	struct vmspace	*p_vmspace;	/* [I] copy of p_p->ps_vmspace */
> @@ -655,18 +650,16 @@ void cpuset_complement(struct cpuset *,
>  int cpuset_cardinality(struct cpuset *);
>  struct cpu_info *cpuset_first(struct cpuset *);
>  
> -static inline void
> +static inline unsigned int
>  tu_enter(struct tusage *tu)
>  {
> -	++tu->tu_gen; /* make the generation number odd */
> -	membar_producer();
> +	return pc_sprod_enter(&tu->tu_pcl);
>  }
>  
>  static inline void
> -tu_leave(struct tusage *tu)
> +tu_leave(struct tusage *tu, unsigned int gen)
>  {
> -	membar_producer();
> -	++tu->tu_gen; /* make the generation number even again */
> +	pc_sprod_leave(&tu->tu_pcl, gen);
>  }
>  
>  #endif /* _KERNEL */
> Index: sys/sys/sched.h
> ===================================================================
> RCS file: /cvs/src/sys/sys/sched.h,v
> diff -u -p -r1.73 sched.h
> --- sys/sys/sched.h	8 Jul 2024 14:46:47 -0000	1.73
> +++ sys/sys/sched.h	4 May 2025 07:18:11 -0000
> @@ -97,6 +97,7 @@ struct cpustats {
>  
>  #include
>  #include
> +#include <sys/pclock.h>
>  
>  #define SCHED_NQS	32		/* 32 run queues. */
>  
> @@ -112,6 +113,7 @@ struct schedstate_percpu {
>  	struct timespec spc_runtime;	/* time curproc started running */
>  	volatile int spc_schedflags;	/* flags; see below */
>  	u_int spc_schedticks;		/* ticks for schedclock() */
> +	struct pc_lock spc_cp_time_lock;
>  	u_int64_t spc_cp_time[CPUSTATES]; /* CPU state statistics */
>  	u_char spc_curpriority;		/* usrpri of curproc */
> 