From: Claudio Jeker Subject: Re: dt(4): interval/profile: schedule clockintr relative to start of recording To: Scott Cheloha Cc: tech@openbsd.org, mpi@openbsd.org Date: Wed, 28 Feb 2024 08:59:56 +0100 On Tue, Feb 27, 2024 at 01:59:14PM -0600, Scott Cheloha wrote: > On Mon, Feb 26, 2024 at 12:07:50PM +0100, Claudio Jeker wrote: > > On Sat, Feb 24, 2024 at 04:01:28PM -0600, Scott Cheloha wrote: > > > On Sat, Feb 24, 2024 at 10:30:48PM +0100, Claudio Jeker wrote: > > > > On Sat, Feb 24, 2024 at 10:03:27AM -0600, Scott Cheloha wrote: > > > > > To align btrace(8)'s behavior with bpftrace we need to schedule the > > > > > interval/profile clock interrupts relative to the start of recording, > > > > > not relative to the start of the uptime clock. > > > > > > > > > > The problem is obvious when you compare behavior. > > > > > > > > > > bpftrace: > > > > > > > > > > alpine:~# uname -a > > > > > Linux alpine 6.6.16-0-lts #1-Alpine SMP PREEMPT_DYNAMIC Wed, 07 Feb 2024 18:00:38 +0000 x86_64 Linux > > > > > alpine:~# bpftrace --version > > > > > bpftrace v0.19.1 > > > > > alpine:~# cat interval-start-latency.bt > > > > > BEGIN { > > > > > @t0 = nsecs; > > > > > } > > > > > > > > > > interval:hz:1 { > > > > > $dt = nsecs - @t0; > > > > > printf("elapsed %d.%09d\n", $dt / 1000000000, $dt % 1000000000); > > > > > exit(); > > > > > } > > > > > alpine:~# bpftrace -q interval-start-latency.bt | head -n 1 > > > > > elapsed 1.014731440 > > > > > alpine:~# cat profile-start-latency.bt > > > > > BEGIN { > > > > > @t0 = nsecs; > > > > > } > > > > > > > > > > profile:hz:1 { > > > > > $dt = nsecs - @t0; > > > > > printf("elapsed %d.%09d\n", $dt / 1000000000, $dt % 1000000000); > > > > > exit(); > > > > > } > > > > > alpine:~# bpftrace -q profile-start-latency.bt | head -n 1 > > > > > elapsed 1.023011922 > > > > > > > > > > btrace(8): > > > > > > > > > > $ doas btrace interval-start-latency.bt > > > > > elapsed 0.586779311 > > > > > $ doas btrace profile-start-latency.bt > > > > > elapsed 0.252215375 > > > > > > > > > > Fix attached. ok? > > > > > > > > btrace is not bpftrace. I think in most cases we should not try to emulate > > > > bpftrace since that will go poorly. > > > > > > I thought one of the goals of btrace(8) *was* to emulate bpftrace. At > > > least, I thought that was a goal when doing so wasn't difficult or > > > overly complicated. > > > > While btrace(8) is emulating bpftrace it is a different tool and we don't > > need to be feature and bug compatible. That's all I want to point out. > > Okay, I think we are on the same page. > > > > Even if that isn't the case (?), in this particular instance I really > > > think the bpftrace behavior is better than ours. > > > > > > When I run a periodic probe, I want the execution period to be > > > consistent because that makes it simpler to compare two intervals. > > > > > > Example: suppose you're printing a summary of whatever you're > > > measuring once per second. You probably want the measurement interval > > > to be the same for each summary, i.e. you want the first summary to > > > print 1 second after you start measuring, and then the next summary to > > > print 1 second after that, and so on and so forth. This patch makes > > > btrace(8) do exactly that. > > > > Yes, this makes sense. We want to start recording all probes at the same > > time and then they should fire at their rate. I agree that this behaviour > > is less surprising. > > > > > Scheduling the clock interrupt relative to the moment we start > > > recording also yields more predictable behavior across different runs > > > of a script. Right now, the latency between BEGIN and the first time > > > a periodic probe fires is essentially random each time you run. > > > > Could we simplify this diff please? I see no reason why now is set > > conditionally. See below. > > > > > Index: dt_dev.c > > > =================================================================== > > > RCS file: /cvs/src/sys/dev/dt/dt_dev.c,v > > > diff -u -p -r1.31 dt_dev.c > > > --- dt_dev.c 18 Feb 2024 00:54:03 -0000 1.31 > > > +++ dt_dev.c 24 Feb 2024 15:59:17 -0000 > > > @@ -477,7 +477,9 @@ dt_ioctl_get_stats(struct dt_softc *sc, > > > int > > > dt_ioctl_record_start(struct dt_softc *sc) > > > { > > > + uint64_t now; > > > struct dt_pcb *dp; > > > + int need_time = 1; > > > > > > if (sc->ds_recording) > > > return EBUSY; > > > @@ -495,9 +497,14 @@ dt_ioctl_record_start(struct dt_softc *s > > > dtp->dtp_prov->dtpv_recording++; > > > > > > if (dp->dp_nsecs != 0) { > > > + if (need_time) { > > > + need_time = 0; > > > + now = nsecuptime(); > > > + } > > > > Instead of adding need_time can't we just call now = nsecuptime(); between > > rw_enter_write(&dt_lock); > > and > > TAILQ_FOREACH(dp, &sc->ds_pcbs, dp_snext) { > > > > dt_ioctl_record_start() is not called that frequently that an extra > > nsecuptime() would be noticed. > > Yeah, I guess I'm being a little too fussy. Let's go with that. > > ok? > > Index: dt_dev.c > =================================================================== > RCS file: /cvs/src/sys/dev/dt/dt_dev.c,v > diff -u -p -r1.31 dt_dev.c > --- dt_dev.c 18 Feb 2024 00:54:03 -0000 1.31 > +++ dt_dev.c 27 Feb 2024 19:58:30 -0000 > @@ -477,6 +477,7 @@ dt_ioctl_get_stats(struct dt_softc *sc, > int > dt_ioctl_record_start(struct dt_softc *sc) > { > + uint64_t now; > struct dt_pcb *dp; > > if (sc->ds_recording) > @@ -487,6 +488,7 @@ dt_ioctl_record_start(struct dt_softc *s > return ENOENT; > > rw_enter_write(&dt_lock); > + now = nsecuptime(); > TAILQ_FOREACH(dp, &sc->ds_pcbs, dp_snext) { > struct dt_probe *dtp = dp->dp_dtp; > > @@ -497,7 +499,8 @@ dt_ioctl_record_start(struct dt_softc *s > if (dp->dp_nsecs != 0) { > clockintr_bind(&dp->dp_clockintr, dp->dp_cpu, dt_clock, > dp); > - clockintr_advance(&dp->dp_clockintr, dp->dp_nsecs); > + clockintr_schedule(&dp->dp_clockintr, > + now + dp->dp_nsecs); > } > } > rw_exit_write(&dt_lock); OK claudio@ -- :wq Claudio