From: Scott Cheloha Subject: Re: dt(4): interval/profile: schedule clockintr relative to start of recording To: tech@openbsd.org Cc: claudio@openbsd.org, mpi@openbsd.org Date: Sat, 24 Feb 2024 16:01:28 -0600 On Sat, Feb 24, 2024 at 10:30:48PM +0100, Claudio Jeker wrote: > On Sat, Feb 24, 2024 at 10:03:27AM -0600, Scott Cheloha wrote: > > To align btrace(8)'s behavior with bpftrace we need to schedule the > > interval/profile clock interrupts relative to the start of recording, > > not relative to the start of the uptime clock. > > > > The problem is obvious when you compare behavior. > > > > bpftrace: > > > > alpine:~# uname -a > > Linux alpine 6.6.16-0-lts #1-Alpine SMP PREEMPT_DYNAMIC Wed, 07 Feb 2024 18:00:38 +0000 x86_64 Linux > > alpine:~# bpftrace --version > > bpftrace v0.19.1 > > alpine:~# cat interval-start-latency.bt > > BEGIN { > > @t0 = nsecs; > > } > > > > interval:hz:1 { > > $dt = nsecs - @t0; > > printf("elapsed %d.%09d\n", $dt / 1000000000, $dt % 1000000000); > > exit(); > > } > > alpine:~# bpftrace -q interval-start-latency.bt | head -n 1 > > elapsed 1.014731440 > > alpine:~# cat profile-start-latency.bt > > BEGIN { > > @t0 = nsecs; > > } > > > > profile:hz:1 { > > $dt = nsecs - @t0; > > printf("elapsed %d.%09d\n", $dt / 1000000000, $dt % 1000000000); > > exit(); > > } > > alpine:~# bpftrace -q profile-start-latency.bt | head -n 1 > > elapsed 1.023011922 > > > > btrace(8): > > > > $ doas btrace interval-start-latency.bt > > elapsed 0.586779311 > > $ doas btrace profile-start-latency.bt > > elapsed 0.252215375 > > > > Fix attached. ok? > > btrace is not bpftrace. I think in most cases we should not try to emulate > bpftrace since that will go poorly. I thought one of the goals of btrace(8) *was* to emulate bpftrace. At least, I thought that was a goal when doing so wasn't difficult or overly complicated. Even if that isn't the case (?), in this particular instance I really think the bpftrace behavior is better than ours. When I run a periodic probe, I want the execution period to be consistent because that makes it simpler to compare two intervals. Example: suppose you're printing a summary of whatever you're measuring once per second. You probably want the measurement interval to be the same for each summary, i.e. you want the first summary to print 1 second after you start measuring, and then the next summary to print 1 second after that, and so on and so forth. This patch makes btrace(8) do exactly that. Scheduling the clock interrupt relative to the moment we start recording also yields more predictable behavior across different runs of a script. Right now, the latency between BEGIN and the first time a periodic probe fires is essentially random each time you run. Index: dt_dev.c =================================================================== RCS file: /cvs/src/sys/dev/dt/dt_dev.c,v diff -u -p -r1.31 dt_dev.c --- dt_dev.c 18 Feb 2024 00:54:03 -0000 1.31 +++ dt_dev.c 24 Feb 2024 15:59:17 -0000 @@ -477,7 +477,9 @@ dt_ioctl_get_stats(struct dt_softc *sc, int dt_ioctl_record_start(struct dt_softc *sc) { + uint64_t now; struct dt_pcb *dp; + int need_time = 1; if (sc->ds_recording) return EBUSY; @@ -495,9 +497,14 @@ dt_ioctl_record_start(struct dt_softc *s dtp->dtp_prov->dtpv_recording++; if (dp->dp_nsecs != 0) { + if (need_time) { + need_time = 0; + now = nsecuptime(); + } clockintr_bind(&dp->dp_clockintr, dp->dp_cpu, dt_clock, dp); - clockintr_advance(&dp->dp_clockintr, dp->dp_nsecs); + clockintr_schedule(&dp->dp_clockintr, + now + dp->dp_nsecs); } } rw_exit_write(&dt_lock);