From: Scott Cheloha
Subject: Re: dt(4): profile: don't stagger clock interrupts
To: tech@openbsd.org
Date: Fri, 16 Feb 2024 10:25:23 -0600

On Fri, Feb 09, 2024 at 02:22:48PM -0600, Scott Cheloha wrote:
> Now that the profile probe is separated from the hardclock() we can
> start improving it.
>
> The simplest thing we can do to reduce profiling overhead is to get
> rid of clock interrupt staggering. It's an artifact of the
> hardclock(). The problem is intuitive: on average, reading N
> profiling events during a single wakeup is cheaper than reading a
> single profiling event across N separate wakeups.
>
> Two "gotchas" to take note of:
>
> 1. The event buffer in btrace(8) is fixed-size. On machines with
>    lots of CPUs there may not be enough room to grab all the
>    profiling events in one read(2).
>
> 2. There is a hotspot in dt_pcb_ring_consume() where every
>    CPU on the system will try to enter ds_mtx simultaneously
>    to increment ds_evtcnt.
>
> Both can be fixed separately. Plus, the overhead of the mutex
> contention in (2) is minuscule compared to the overhead of the extra
> wakeups under the current scheme.
>
> This can wait a few days, just in case we need to back out the recent
> dt(4) changes.

Ping.

Index: dt_dev.c
===================================================================
RCS file: /cvs/src/sys/dev/dt/dt_dev.c,v
diff -u -p -r1.30 dt_dev.c
--- dt_dev.c	9 Feb 2024 17:42:18 -0000	1.30
+++ dt_dev.c	9 Feb 2024 20:06:00 -0000
@@ -497,8 +497,6 @@ dt_ioctl_record_start(struct dt_softc *s
 		if (dp->dp_nsecs != 0) {
 			clockintr_bind(&dp->dp_clockintr, dp->dp_cpu, dt_clock,
 			    dp);
-			clockintr_stagger(&dp->dp_clockintr, dp->dp_nsecs,
-			    CPU_INFO_UNIT(dp->dp_cpu), MAXCPUS);
 			clockintr_advance(&dp->dp_clockintr, dp->dp_nsecs);
 		}
 	}
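
For anyone who wants to see the wakeup argument in numbers, here is a
rough userland sketch. It is not kernel code. It assumes staggering
places CPU i at offset (period / ncpus * i) within each period, which
is a simplification (the call removed by the diff divides by MAXCPUS,
not the number of CPUs present), and it assumes every distinct firing
instant costs the reader one wakeup. The names first_expiration() and
wakeups_per_period() are made up for this example.

```c
#include <stdint.h>

/*
 * Hypothetical model of where CPU `unit` first fires within a period.
 * With staggering, CPUs are spread evenly across the period.
 * Without it, every CPU fires at offset 0.
 */
static uint64_t
first_expiration(uint64_t period_ns, uint32_t unit, uint32_t ncpus,
    int stagger)
{
	if (!stagger || ncpus == 0)
		return 0;
	return period_ns / ncpus * unit;
}

/*
 * Count the distinct instants within one period at which at least one
 * CPU records a profiling event. Under the assumption stated above,
 * this is the number of times the reader is woken per period.
 */
uint32_t
wakeups_per_period(uint64_t period_ns, uint32_t ncpus, int stagger)
{
	uint32_t i, j, distinct = 0;

	for (i = 0; i < ncpus; i++) {
		uint64_t t = first_expiration(period_ns, i, ncpus, stagger);
		int seen = 0;

		/* Skip this instant if an earlier CPU already fires at it. */
		for (j = 0; j < i; j++) {
			if (first_expiration(period_ns, j, ncpus,
			    stagger) == t) {
				seen = 1;
				break;
			}
		}
		if (!seen)
			distinct++;
	}
	return distinct;
}
```

In this model, 8 CPUs with a 10ms period give 8 wakeups per period
when staggered and 1 when not. The same 8 events are delivered either
way, so the unstaggered case is where both gotchas come from: all
events arrive in the same read(2) and all CPUs reach ds_mtx together.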