From: Mark Kettenis <mark.kettenis@xs4all.nl>
Subject: Re: spread 8 network interrupt over cpu and softnet
To: Alexander Bluhm <bluhm@openbsd.org>
Cc: tech@openbsd.org
Date: Sun, 05 Oct 2025 18:32:31 +0200

> Date: Sat, 4 Oct 2025 21:55:46 +0200
> From: Alexander Bluhm <bluhm@openbsd.org>
> 
> On Fri, Oct 03, 2025 at 04:27:33PM +0200, Mark Kettenis wrote:
> > > Date: Sat, 27 Sep 2025 18:36:50 +0200
> > > From: Alexander Bluhm <bluhm@openbsd.org>
> > > 
> > > Hi,
> > > 
> > > Currently most network drivers use 8 queues and distribute interrupts.
> > > ix(4) is the only one that may allocate more than 8.  ice(4) uses
> > > only 8 vectors, but allocates more interrupts.
> > > 
> > > I would like to limit interrupts and queues to 8 for all drivers.
> > > This prevents running out of interrupt vectors so easily.
> > 
> > Here's a nickel, boy; go buy yourself a better computer!
> > 
> > Joking aside, we shouldn't waste resources, and it is plausible that
> > using the maximum number of queues provided by the hardware isn't
> > optimal.
> 
> The underlying problem is that amd64 installs interrupt vectors on
> all CPUs although each interrupt is only delivered to one CPU.  After
> we have fixed that, we can reconsider the IF_MAX_VECTORS limit.  But
> even then it makes sense to have some common limit for all drivers.
> With that we can find an optimal value later.

Sure.

> > > Currently we have 8 softnet threads, which limits parallel processing
> > > anyway.  We can tune these values later.
> > 
> > Does it even make sense to have more rx queues than softnet threads?
> > Should there be a single limit for both?
> 
> The number of softnet threads, the number of interrupts, and the
> number of CPUs influence each other.  But without proper testing, it
> is hard to say what the performance impact is.  For that I want
> independent limits to run experiments on multiple-CPU machines.  Then
> we can find values that are suitable for common use cases.
> 
> I need distinct constants for my experiments.

Sure.

> > > With this diff, forwarding throughput increases from 45 to 50 Gbit/sec
> > > on my 12 core machine from ice0 to ice1.  I am using iperf3 TCP on
> > > Linux to measure it.  I think the slower performance happened because
> > > ice(4) was allocating more interrupts than it has vectors.  Now
> > > everything is limited to 8.  For other drivers I see no difference
> > > as they operate at line speed anyway.
> > > 
> > > I think the arbitrary numbers IXL_MAX_VECTORS, IGC_MAX_VECTORS and
> > > ICE_MAX_VECTORS could go away, but that would be per-driver diffs.
> > > 
> > > Second part of the diff spreads the interrupts of network devices
> > > equally over the softnet tasks.
> > 
> > That is this line:
> > 
> > > -	ifq->ifq_softnet = net_tq(idx);
> > > +	ifq->ifq_softnet = net_tq(ifp->if_index * IF_MAX_VECTORS + idx);
> > 
> > Does this even do anything given that both IF_MAX_VECTORS and the
> > number of softnet threads are 8?  Doesn't this simply reduce to:
> > 
> >     (ifp->if_index * 8 + idx) % 8 = idx
> 
> Currently it does nothing; the factor of 8 is eliminated by the modulo.
> But I want to see what happens with 16 softnets, and I need an easy
> way to test that.

Well, you can just change that line to whatever you want to test in
the kernel you're running.
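
To make the arithmetic concrete, here is a small userland sketch; it is
explicitly not kernel code and only assumes that net_tq() picks a
softnet thread by reducing its argument modulo the number of threads,
which is the property the reduction above relies on (net_tq_model() is
just a stand-in for that).  With 8 threads every queue lands on softnet
idx regardless of the interface; with 16 threads whole interfaces
alternate between the lower and upper half:

#include <stdio.h>

#define IF_MAX_VECTORS	8

/* stand-in for net_tq(); assumes a plain modulo over the thread count */
static unsigned int
net_tq_model(unsigned int index, unsigned int nthreads)
{
	return index % nthreads;
}

int
main(void)
{
	unsigned int nthreads, if_index, idx;

	for (nthreads = 8; nthreads <= 16; nthreads += 8) {
		printf("%u softnet threads:\n", nthreads);
		for (if_index = 1; if_index <= 3; if_index++) {
			for (idx = 0; idx < 4; idx++)
				printf("  if%u q%u -> %u", if_index, idx,
				    net_tq_model(if_index * IF_MAX_VECTORS +
				    idx, nthreads));
			printf("\n");
		}
	}
	return 0;
}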

> > Maybe it is better to stagger the interrupts instead?  Like
> > 
> >     ifq->ifq_softnet = net_tq(ifp->if_index + idx);
> 
> That code was there before, but dlg@ removed it.  As we want to
> distribute traffic with the Toeplitz hash, adding the interface index
> is wrong.  My idea is to use the Toeplitz hash over 8 softnet queues.
> But if we have more threads, distribute the next interface over the
> remaining threads.

But you didn't test that yet...
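
For comparison, the staggered line I suggested above, run through the
same toy model (again just assuming a modulo behind net_tq()), offsets
each interface's queues by its index, so with 8 threads if1 starts at
softnet 1, if2 at softnet 2, and so on:

#include <stdio.h>

/* same toy model as above, only with the staggered if_index + idx mapping */
int
main(void)
{
	unsigned int nthreads = 8, if_index, idx;

	for (if_index = 1; if_index <= 3; if_index++) {
		for (idx = 0; idx < 4; idx++)
			printf("  if%u q%u -> %u", if_index, idx,
			    (if_index + idx) % nthreads);
		printf("\n");
	}
	return 0;
}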

> Currently both are 8 and this does not matter.  But I want
> infrastructure to play with these numbers.  If it does not work,
> we can adjust the algorithm later.

Yes, you can.  So why do you do it now?

So, I'm fine with the IF_MAX_VECTORS bits going in, but I don't see
the point of the net/ifq.c change going in at this point.

> > > Index: dev/pci/if_bnxt.c
> > > ===================================================================
> > > RCS file: /data/mirror/openbsd/cvs/src/sys/dev/pci/if_bnxt.c,v
> > > diff -u -p -r1.56 if_bnxt.c
> > > --- dev/pci/if_bnxt.c	5 Sep 2025 09:58:24 -0000	1.56
> > > +++ dev/pci/if_bnxt.c	25 Sep 2025 08:27:45 -0000
> > > @@ -545,9 +545,11 @@ bnxt_attach(struct device *parent, struc
> > >  		nmsix = pci_intr_msix_count(pa);
> > >  		if (nmsix > 1) {
> > >  			sc->sc_ih = pci_intr_establish(sc->sc_pc, ih,
> > > -			    IPL_NET | IPL_MPSAFE, bnxt_admin_intr, sc, DEVNAME(sc));
> > > -			sc->sc_intrmap = intrmap_create(&sc->sc_dev,
> > > -			    nmsix - 1, BNXT_MAX_QUEUES, INTRMAP_POWEROF2);
> > > +			    IPL_NET | IPL_MPSAFE, bnxt_admin_intr, sc,
> > > +			    DEVNAME(sc));
> > > +			sc->sc_intrmap = intrmap_create(&sc->sc_dev, nmsix - 1,
> > > +			    MIN(BNXT_MAX_QUEUES, IF_MAX_VECTORS),
> > > +			    INTRMAP_POWEROF2);
> > >  			sc->sc_nqueues = intrmap_count(sc->sc_intrmap);
> > >  			KASSERT(sc->sc_nqueues > 0);
> > >  			KASSERT(powerof2(sc->sc_nqueues));
> > > Index: dev/pci/if_ice.c
> > > ===================================================================
> > > RCS file: /data/mirror/openbsd/cvs/src/sys/dev/pci/if_ice.c,v
> > > diff -u -p -r1.59 if_ice.c
> > > --- dev/pci/if_ice.c	17 Sep 2025 12:54:19 -0000	1.59
> > > +++ dev/pci/if_ice.c	27 Sep 2025 07:35:25 -0000
> > > @@ -30501,7 +30501,8 @@ ice_attach_hook(struct device *self)
> > >  	sc->sc_nmsix = nmsix;
> > >  	nqueues_max = MIN(sc->isc_nrxqsets_max, sc->isc_ntxqsets_max);
> > >  	sc->sc_intrmap = intrmap_create(&sc->sc_dev, sc->sc_nmsix - 1,
> > > -	    nqueues_max, INTRMAP_POWEROF2);
> > > +	    MIN(MIN(nqueues_max, ICE_MAX_VECTORS), IF_MAX_VECTORS),
> > > +	    INTRMAP_POWEROF2);
> > >  	nqueues = intrmap_count(sc->sc_intrmap);
> > >  	KASSERT(nqueues > 0);
> > >  	KASSERT(powerof2(nqueues));
> > > Index: dev/pci/if_igc.c
> > > ===================================================================
> > > RCS file: /data/mirror/openbsd/cvs/src/sys/dev/pci/if_igc.c,v
> > > diff -u -p -r1.28 if_igc.c
> > > --- dev/pci/if_igc.c	24 Jun 2025 11:00:27 -0000	1.28
> > > +++ dev/pci/if_igc.c	27 Sep 2025 07:29:57 -0000
> > > @@ -724,8 +724,8 @@ igc_setup_msix(struct igc_softc *sc)
> > >  	/* Give one vector to events. */
> > >  	nmsix--;
> > >  
> > > -	sc->sc_intrmap = intrmap_create(&sc->sc_dev, nmsix, IGC_MAX_VECTORS,
> > > -	    INTRMAP_POWEROF2);
> > > +	sc->sc_intrmap = intrmap_create(&sc->sc_dev, nmsix,
> > > +	    MIN(IGC_MAX_VECTORS, IF_MAX_VECTORS), INTRMAP_POWEROF2);
> > >  	sc->sc_nqueues = intrmap_count(sc->sc_intrmap);
> > >  }
> > >  
> > > Index: dev/pci/if_ix.c
> > > ===================================================================
> > > RCS file: /data/mirror/openbsd/cvs/src/sys/dev/pci/if_ix.c,v
> > > diff -u -p -r1.221 if_ix.c
> > > --- dev/pci/if_ix.c	24 Jun 2025 11:02:03 -0000	1.221
> > > +++ dev/pci/if_ix.c	25 Sep 2025 08:27:45 -0000
> > > @@ -1854,7 +1854,8 @@ ixgbe_setup_msix(struct ix_softc *sc)
> > >  	/* XXX the number of queues is limited to what we can keep stats on */
> > >  	maxq = (sc->hw.mac.type == ixgbe_mac_82598EB) ? 8 : 16;
> > >  
> > > -	sc->sc_intrmap = intrmap_create(&sc->dev, nmsix, maxq, 0);
> > > +	sc->sc_intrmap = intrmap_create(&sc->dev, nmsix,
> > > +	    MIN(maxq, IF_MAX_VECTORS), 0);
> > >  	sc->num_queues = intrmap_count(sc->sc_intrmap);
> > >  }
> > >  
> > > Index: dev/pci/if_ixl.c
> > > ===================================================================
> > > RCS file: /data/mirror/openbsd/cvs/src/sys/dev/pci/if_ixl.c,v
> > > diff -u -p -r1.109 if_ixl.c
> > > --- dev/pci/if_ixl.c	17 Sep 2025 12:54:19 -0000	1.109
> > > +++ dev/pci/if_ixl.c	27 Sep 2025 07:29:57 -0000
> > > @@ -1782,8 +1782,9 @@ ixl_attach(struct device *parent, struct
> > >  		if (nmsix > 1) { /* we used 1 (the 0th) for the adminq */
> > >  			nmsix--;
> > >  
> > > -			sc->sc_intrmap = intrmap_create(&sc->sc_dev,
> > > -			    nmsix, IXL_MAX_VECTORS, INTRMAP_POWEROF2);
> > > +			sc->sc_intrmap = intrmap_create(&sc->sc_dev, nmsix,
> > > +			    MIN(IXL_MAX_VECTORS, IF_MAX_VECTORS),
> > > +			    INTRMAP_POWEROF2);
> > >  			nqueues = intrmap_count(sc->sc_intrmap);
> > >  			KASSERT(nqueues > 0);
> > >  			KASSERT(powerof2(nqueues));
> > > Index: net/if.h
> > > ===================================================================
> > > RCS file: /data/mirror/openbsd/cvs/src/sys/net/if.h,v
> > > diff -u -p -r1.221 if.h
> > > --- net/if.h	9 Sep 2025 09:16:18 -0000	1.221
> > > +++ net/if.h	25 Sep 2025 08:27:45 -0000
> > > @@ -526,6 +526,9 @@ struct if_sffpage {
> > >  #include <net/if_arp.h>
> > >  
> > >  #ifdef _KERNEL
> > > +
> > > +#define IF_MAX_VECTORS		8
> > > +
> > >  struct socket;
> > >  struct ifnet;
> > >  struct ifq_ops;
> > > Index: net/ifq.c
> > > ===================================================================
> > > RCS file: /data/mirror/openbsd/cvs/src/sys/net/ifq.c,v
> > > diff -u -p -r1.62 ifq.c
> > > --- net/ifq.c	28 Jul 2025 05:25:44 -0000	1.62
> > > +++ net/ifq.c	25 Sep 2025 08:27:45 -0000
> > > @@ -255,7 +255,7 @@ void
> > >  ifq_init(struct ifqueue *ifq, struct ifnet *ifp, unsigned int idx)
> > >  {
> > >  	ifq->ifq_if = ifp;
> > > -	ifq->ifq_softnet = net_tq(idx);
> > > +	ifq->ifq_softnet = net_tq(ifp->if_index * IF_MAX_VECTORS + idx);
> > >  	ifq->ifq_softc = NULL;
> > >  
> > >  	mtx_init(&ifq->ifq_mtx, IPL_NET);
> > > 
> > > 
> > 
> 
>