From: Visa Hankala Subject: Re: octeon: interrupts and barier cleanup To: "Kirill A. Korinsky" Cc: tech@openbsd.org Date: Fri, 3 Apr 2026 13:47:24 +0000 On Thu, Apr 02, 2026 at 11:44:03PM +0200, Kirill A. Korinsky wrote: > On Thu, 02 Apr 2026 15:48:01 +0200, > Visa Hankala wrote: > > > > On Wed, Apr 01, 2026 at 09:44:40PM +0200, Kirill A. Korinsky wrote: > > > Visa, tech@, > > > > > > in my work to bring support queues in cnmac I think I discovered a weird > > > issue in interrupts and barrier in octeon. > > > > > > octciu_intr_establish() has: > > > > > > 201 int cpuid = cpu_number(); > > > ... > > > 208 #ifdef MULTIPROCESSOR > > > 209 /* Span work queue interrupts across CPUs. */ > > > 210 if (IS_WORKQ_IRQ(irq)) > > > 211 cpuid = irq % ncpus; > > > 212 #endif > > > ... > > > 222 ih->ih_irq = irq; > > > > > > my understanding is that octciu_intr_establish() happens on autoconf where > > > only CPU0 exists, and on octeon we increase ncpus in cpu hatch, and not > > > attach, which means that ncpus is 1 at octciu_intr_establish() time. > > > > > > What makes cpuid always 1 for that code and schedules all interrupts on > > > CPU0. > > > > Hmm, the code used ncpusfound originally. But it was later changed to > > use ncpus, which apparently broke the logic. > > > > Thanks for the lead. This reads like regression which I see with kn@'s diff > https://github.com/openbsd/src/commit/fee69528d2cd998356813db0922e7fd576485f9c > > > > Next, octciu_intr_barrier(): > > > > > > 290 void > > > 291 octciu_intr_barrier(void *_ih) > > > 292 { > > > 293 struct cpu_info *ci = NULL; > > > 294 #ifdef MULTIPROCESSOR > > > 295 struct octciu_intrhand *ih = _ih; > > > > > > 296 if (IS_WORKQ_IRQ(ih->ih_irq)) > > > 297 ci = get_cpu_info(ih->ih_irq % ncpus); > > > 298 #endif > > > > > > 299 sched_barrier(ci); > > > 300 } > > > > > > because ih->ih_irq is 0 here, it picks cpu0 only. > > > > > > kn@'s diff https://marc.info/?l=openbsd-bugs&m=177171071006897&w=2 > > > introduced a regression on my ER-4: > > > https://marc.info/?l=openbsd-bugs&m=177177669428825&w=2 and this regression > > > can be avoided by removing the block in octciu_intr_establish(). > > > > > > Here is the diff where I remove this block and % ncpus. > > > > > > This diff keeps interrupts on CPU0 which is, actually, current behaviour and > > > unblocks kn@'s diff to enable softnet threads on octeon which unblocks my > > > work of making cnmac with RX queue: > > > https://marc.info/?l=openbsd-tech&m=177506394121597&w=2 > > > > > > So far I see this diff as safe because it doesn't change current behaviour > > > but I may miss some details. > > > > > > OK to commit it? > > > > I suggest you surround the #ifdef MULTIPROCESSOR blocks in > > octciu_intr_establish() and octciu_intr_barrier() with #if 0, because > > you will probably need them later for distributing the POW group > > interrupts and testing your multi-queue work. > > > > I still not completley sure that the second block in octciu_intr_barrier() > need to be removed. > > In my full diff, which I send into tech@ as cumulative one I don't touch > octciu_intr_barrier(), and I run it on my home's gw and it works well, > stable, and I have near ~1gbps between vlans on the same cnmac. > > So, here a diff which I think is right, which shouldn't change current > behaviour, and which avoid regression from kn@'s difs and on my gw with two > diffs I see multiple softnet threads and don't see the regression. > > Ok? OK visa@ > Index: sys/arch/octeon/dev/octciu.c > =================================================================== > RCS file: /home/cvs/src/sys/arch/octeon/dev/octciu.c,v > diff -u -p -r1.19 octciu.c > --- sys/arch/octeon/dev/octciu.c 11 Dec 2022 05:31:05 -0000 1.19 > +++ sys/arch/octeon/dev/octciu.c 2 Apr 2026 21:27:14 -0000 > @@ -250,11 +250,13 @@ octciu_intr_establish(int irq, int level > panic("%s: illegal irq %d", __func__, irq); > #endif > > +#if 0 > #ifdef MULTIPROCESSOR > /* Span work queue interrupts across CPUs. */ > if (IS_WORKQ_IRQ(irq)) > cpuid = irq % ncpus; > #endif > +#endif > > flags = (level & IPL_MPSAFE) ? CIH_MPSAFE : 0; > level &= ~IPL_MPSAFE; > @@ -363,7 +365,7 @@ octciu_intr_barrier(void *_ih) > struct octciu_intrhand *ih = _ih; > > if (IS_WORKQ_IRQ(ih->ih_irq)) > - ci = get_cpu_info(ih->ih_irq % ncpus); > + ci = get_cpu_info(ih->ih_cpuid); > #endif > > sched_barrier(ci); > > > -- > wbr, Kirill >