Index | Thread | Search

From:
Visa Hankala <visa@hankala.org>
Subject:
Re: octeon: interrupts and barier cleanup
To:
"Kirill A. Korinsky" <kirill@korins.ky>
Cc:
tech@openbsd.org
Date:
Fri, 3 Apr 2026 13:47:24 +0000

Download raw body.

Thread
On Thu, Apr 02, 2026 at 11:44:03PM +0200, Kirill A. Korinsky wrote:
> On Thu, 02 Apr 2026 15:48:01 +0200,
> Visa Hankala <visa@hankala.org> wrote:
> > 
> > On Wed, Apr 01, 2026 at 09:44:40PM +0200, Kirill A. Korinsky wrote:
> > > Visa, tech@,
> > > 
> > > in my work to bring support queues in cnmac I think I discovered a weird
> > > issue in interrupts and barrier in octeon.
> > > 
> > > octciu_intr_establish() has:
> > > 
> > >    201          int cpuid = cpu_number();
> > > ...
> > >    208  #ifdef MULTIPROCESSOR
> > >    209          /* Span work queue interrupts across CPUs. */
> > >    210          if (IS_WORKQ_IRQ(irq))
> > >    211                  cpuid = irq % ncpus;
> > >    212  #endif
> > > ...
> > >    222          ih->ih_irq = irq;
> > > 
> > > my understanding is that octciu_intr_establish() happens on autoconf where
> > > only CPU0 exists, and on octeon we increase ncpus in cpu hatch, and not
> > > attach, which means that ncpus is 1 at octciu_intr_establish() time.
> > > 
> > > What makes cpuid always 1 for that code and schedules all interrupts on
> > > CPU0.
> > 
> > Hmm, the code used ncpusfound originally. But it was later changed to
> > use ncpus, which apparently broke the logic.
> >
> 
> Thanks for the lead. This reads like regression which I see with kn@'s diff
> https://github.com/openbsd/src/commit/fee69528d2cd998356813db0922e7fd576485f9c
> 
> > > Next, octciu_intr_barrier():
> > > 
> > >    290  void
> > >    291  octciu_intr_barrier(void *_ih)
> > >    292  {
> > >    293          struct cpu_info *ci = NULL;
> > >    294  #ifdef MULTIPROCESSOR
> > >    295          struct octciu_intrhand *ih = _ih;
> > >       
> > >    296          if (IS_WORKQ_IRQ(ih->ih_irq))
> > >    297                  ci = get_cpu_info(ih->ih_irq % ncpus);
> > >    298  #endif
> > >       
> > >    299          sched_barrier(ci);
> > >    300  }
> > > 
> > > because ih->ih_irq is 0 here, it picks cpu0 only.
> > > 
> > > kn@'s diff https://marc.info/?l=openbsd-bugs&m=177171071006897&w=2
> > > introduced a regression on my ER-4:
> > > https://marc.info/?l=openbsd-bugs&m=177177669428825&w=2 and this regression
> > > can be avoided by removing the block in octciu_intr_establish().
> > > 
> > > Here is the diff where I remove this block and % ncpus.
> > > 
> > > This diff keeps interrupts on CPU0 which is, actually, current behaviour and
> > > unblocks kn@'s diff to enable softnet threads on octeon which unblocks my
> > > work of making cnmac with RX queue:
> > > https://marc.info/?l=openbsd-tech&m=177506394121597&w=2
> > > 
> > > So far I see this diff as safe because it doesn't change current behaviour
> > > but I may miss some details.
> > > 
> > > OK to commit it?
> > 
> > I suggest you surround the #ifdef MULTIPROCESSOR blocks in
> > octciu_intr_establish() and octciu_intr_barrier() with #if 0, because
> > you will probably need them later for distributing the POW group
> > interrupts and testing your multi-queue work.
> > 
> 
> I still not completley sure that the second block in octciu_intr_barrier()
> need to be removed.
> 
> In my full diff, which I send into tech@ as cumulative one I don't touch
> octciu_intr_barrier(), and I run it on my home's gw and it works well,
> stable, and I have near ~1gbps between vlans on the same cnmac.
> 
> So, here a diff which I think is right, which shouldn't change current
> behaviour, and which avoid regression from kn@'s difs and on my gw with two
> diffs I see multiple softnet threads and don't see the regression.
> 
> Ok?

OK visa@

> Index: sys/arch/octeon/dev/octciu.c
> ===================================================================
> RCS file: /home/cvs/src/sys/arch/octeon/dev/octciu.c,v
> diff -u -p -r1.19 octciu.c
> --- sys/arch/octeon/dev/octciu.c	11 Dec 2022 05:31:05 -0000	1.19
> +++ sys/arch/octeon/dev/octciu.c	2 Apr 2026 21:27:14 -0000
> @@ -250,11 +250,13 @@ octciu_intr_establish(int irq, int level
>  		panic("%s: illegal irq %d", __func__, irq);
>  #endif
>  
> +#if 0
>  #ifdef MULTIPROCESSOR
>  	/* Span work queue interrupts across CPUs. */
>  	if (IS_WORKQ_IRQ(irq))
>  		cpuid = irq % ncpus;
>  #endif
> +#endif
>  
>  	flags = (level & IPL_MPSAFE) ? CIH_MPSAFE : 0;
>  	level &= ~IPL_MPSAFE;
> @@ -363,7 +365,7 @@ octciu_intr_barrier(void *_ih)
>  	struct octciu_intrhand *ih = _ih;
>  
>  	if (IS_WORKQ_IRQ(ih->ih_irq))
> -		ci = get_cpu_info(ih->ih_irq % ncpus);
> +		ci = get_cpu_info(ih->ih_cpuid);
>  #endif
>  
>  	sched_barrier(ci);
> 
> 
> -- 
> wbr, Kirill
>