From: Jonathan Matthew <jonathan@d14n.org>
Subject: Re: spread 8 network interrupt over cpu and softnet
To: Alexander Bluhm <bluhm@openbsd.org>
Cc: tech@openbsd.org
Date: Mon, 13 Oct 2025 14:04:03 +1000

On Tue, Oct 07, 2025 at 02:44:24PM +0200, Alexander Bluhm wrote:
> On Tue, Oct 07, 2025 at 05:33:35PM +1000, Jonathan Matthew wrote:
> > On Sat, Sep 27, 2025 at 06:36:50PM +0200, Alexander Bluhm wrote:
> > > Hi,
> > > 
> > > Currently most network drivers use 8 queues and distribute interrupts.
> > > ix(4) is the only one that may allocate more than 8.  ice(4) uses
> > > only 8 vectors, but allocates more interrupts.
> > > 
> > > I would like to limit interrupts and queues to 8 for all drivers.
> > > This prevents running out of interrupt vectors so easily.  Currently
> > > we have 8 softnet threads which limits parallel processing anyway.
> > > We can tune these values later.
> > > 
> > > With this diff, forwarding throughput increases from 45 to 50
> > > GBit/sec on my 12-core machine from ice0 to ice1.  I am using
> > > iperf3 TCP on Linux to measure it.  I think the slower performance
> > > happened because ice(4) was allocating more interrupts than it has
> > > vectors.  Now everything is limited to 8.  For other drivers I see
> > > no difference as they operate at line speed anyway.
> > > 
> > > I think the arbitrary constants IXL_MAX_VECTORS, IGC_MAX_VECTORS
> > > and ICE_MAX_VECTORS could go away, but that would be per-driver
> > > diffs.
> > 
> > I think the changes to bnxt, igc and ixl, where you're just picking the
> > smaller of two constants, neither of which reflects an actual limit on
> > the number of queues available, should be removed from the diff.
> > We've mostly settled on 8 as the maximum number of queues to use anyway,
> > so this doesn't really change anything.
> 
> Are there any hardware limits?  As I want to play with IF_MAX_VECTORS
> globally, it would be nice to know the capabilities of the hardware.
> That's why I wanted to address this on a per-driver basis.

> 
> igc(4) says
> #define IGC_MAX_VECTORS                8
> without explanation.  Is this an actual hardware limit?

The I225/226 datasheet says the hardware is limited to 4 queues.
On the only igc hardware I have, the msi-x table has 5 entries,
so we could never use 8 queues there anyway.
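
If we want to encode that, a minimal sketch (just lowering the existing
constant and keeping the MIN() against IF_MAX_VECTORS from your diff,
not a tested change) could look like:

	/*
	 * The I225/I226 datasheet limits the hardware to 4 rx/tx
	 * queues, so never ask the intrmap for more than that.
	 */
	#define IGC_MAX_VECTORS		4

	sc->sc_intrmap = intrmap_create(&sc->sc_dev, nmsix,
	    MIN(IGC_MAX_VECTORS, IF_MAX_VECTORS), INTRMAP_POWEROF2);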

> 
> ix has these lines
>         /* XXX the number of queues is limited to what we can keep stats on */
>         maxq = (sc->hw.mac.type == ixgbe_mac_82598EB) ? 8 : 16;
> 
> Are the 8 and 16 restrictions of the hardware?  IF_MAX_VECTORS
> should be the value for optimal system behavior.  On top each driver
> has its own limitations.  That's how I came to the minimum calculation.

The limit here is the number of stats registers.  ix hardware has 64
tx/rx queues, but not enough registers to read statistics off all of
them.
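
So the 16 is a stats limit rather than a queue limit, which the comment
could spell out.  A sketch of what I mean, reusing the lines from your
diff:

	/*
	 * The hardware has 64 tx/rx queues, but only enough stats
	 * registers to keep counters for 16 of them (8 on the
	 * 82598EB), so don't use more queues than we can count on.
	 */
	maxq = (sc->hw.mac.type == ixgbe_mac_82598EB) ? 8 : 16;
	sc->sc_intrmap = intrmap_create(&sc->dev, nmsix,
	    MIN(maxq, IF_MAX_VECTORS), 0);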

Other cases aside from ice:
bnxt - nic firmware gives us limits on the number of msi-x vectors and tx/rx
queues, but we currently ignore them and assume 8 will be available.

ixl - the datasheet says we can create up to 1536 tx/rx queue pairs, and
the number of msi-x vectors available is given in the pci capability
structure, which we already take into account.

iavf - limited to 4 queues according to the virtual function interface
specification.

mcx - nic firmware gives us limits on the number of send/receive queues,
completion queues and event queues, the lowest of which we should use as
the maximum number of queues, but we currently just assume we can use 16
queues (see the sketch after this list).

aq - register layout only has space for 8 queues as far as I can tell,
so 8 is a hardware limit.  Seems to have a bigger msi-x table though.

vmx - seems to be limited to 8 tx and 16 rx queues, so we use 8 at most.

ngbe - datasheet says 8 tx and rx queues.
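
For the firmware-limited drivers, here is the sort of calculation I
have in mind, using mcx as the example.  This is a sketch only; the
sc_max_* fields are made-up names for values the firmware would report,
not the driver's actual softc members:

	/*
	 * Take the lowest of the firmware-reported queue limits
	 * instead of assuming 16 queues, then clamp to what the
	 * stack wants to use.
	 */
	nqueues = MIN(sc->sc_max_sq, sc->sc_max_rq);
	nqueues = MIN(nqueues, sc->sc_max_cq);
	nqueues = MIN(nqueues, sc->sc_max_eq);
	nqueues = MIN(nqueues, IF_MAX_VECTORS);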

> 
> bluhm
> 
> > Removing the per-driver constants and just using IF_MAX_VECTORS instead
> > sounds more useful to me.
> > 
> > > 
> > > The second part of the diff spreads the interrupts of network
> > > devices equally over the softnet tasks.
> > > 
> > > ok?
> > > 
> > > bluhm
> > > 
> > > Index: dev/pci/if_bnxt.c
> > > ===================================================================
> > > RCS file: /data/mirror/openbsd/cvs/src/sys/dev/pci/if_bnxt.c,v
> > > diff -u -p -r1.56 if_bnxt.c
> > > --- dev/pci/if_bnxt.c	5 Sep 2025 09:58:24 -0000	1.56
> > > +++ dev/pci/if_bnxt.c	25 Sep 2025 08:27:45 -0000
> > > @@ -545,9 +545,11 @@ bnxt_attach(struct device *parent, struc
> > >  		nmsix = pci_intr_msix_count(pa);
> > >  		if (nmsix > 1) {
> > >  			sc->sc_ih = pci_intr_establish(sc->sc_pc, ih,
> > > -			    IPL_NET | IPL_MPSAFE, bnxt_admin_intr, sc, DEVNAME(sc));
> > > -			sc->sc_intrmap = intrmap_create(&sc->sc_dev,
> > > -			    nmsix - 1, BNXT_MAX_QUEUES, INTRMAP_POWEROF2);
> > > +			    IPL_NET | IPL_MPSAFE, bnxt_admin_intr, sc,
> > > +			    DEVNAME(sc));
> > > +			sc->sc_intrmap = intrmap_create(&sc->sc_dev, nmsix - 1,
> > > +			    MIN(BNXT_MAX_QUEUES, IF_MAX_VECTORS),
> > > +			    INTRMAP_POWEROF2);
> > >  			sc->sc_nqueues = intrmap_count(sc->sc_intrmap);
> > >  			KASSERT(sc->sc_nqueues > 0);
> > >  			KASSERT(powerof2(sc->sc_nqueues));
> > > Index: dev/pci/if_ice.c
> > > ===================================================================
> > > RCS file: /data/mirror/openbsd/cvs/src/sys/dev/pci/if_ice.c,v
> > > diff -u -p -r1.59 if_ice.c
> > > --- dev/pci/if_ice.c	17 Sep 2025 12:54:19 -0000	1.59
> > > +++ dev/pci/if_ice.c	27 Sep 2025 07:35:25 -0000
> > > @@ -30501,7 +30501,8 @@ ice_attach_hook(struct device *self)
> > >  	sc->sc_nmsix = nmsix;
> > >  	nqueues_max = MIN(sc->isc_nrxqsets_max, sc->isc_ntxqsets_max);
> > >  	sc->sc_intrmap = intrmap_create(&sc->sc_dev, sc->sc_nmsix - 1,
> > > -	    nqueues_max, INTRMAP_POWEROF2);
> > > +	    MIN(MIN(nqueues_max, ICE_MAX_VECTORS), IF_MAX_VECTORS),
> > > +	    INTRMAP_POWEROF2);
> > >  	nqueues = intrmap_count(sc->sc_intrmap);
> > >  	KASSERT(nqueues > 0);
> > >  	KASSERT(powerof2(nqueues));
> > > Index: dev/pci/if_igc.c
> > > ===================================================================
> > > RCS file: /data/mirror/openbsd/cvs/src/sys/dev/pci/if_igc.c,v
> > > diff -u -p -r1.28 if_igc.c
> > > --- dev/pci/if_igc.c	24 Jun 2025 11:00:27 -0000	1.28
> > > +++ dev/pci/if_igc.c	27 Sep 2025 07:29:57 -0000
> > > @@ -724,8 +724,8 @@ igc_setup_msix(struct igc_softc *sc)
> > >  	/* Give one vector to events. */
> > >  	nmsix--;
> > >  
> > > -	sc->sc_intrmap = intrmap_create(&sc->sc_dev, nmsix, IGC_MAX_VECTORS,
> > > -	    INTRMAP_POWEROF2);
> > > +	sc->sc_intrmap = intrmap_create(&sc->sc_dev, nmsix,
> > > +	    MIN(IGC_MAX_VECTORS, IF_MAX_VECTORS), INTRMAP_POWEROF2);
> > >  	sc->sc_nqueues = intrmap_count(sc->sc_intrmap);
> > >  }
> > >  
> > > Index: dev/pci/if_ix.c
> > > ===================================================================
> > > RCS file: /data/mirror/openbsd/cvs/src/sys/dev/pci/if_ix.c,v
> > > diff -u -p -r1.221 if_ix.c
> > > --- dev/pci/if_ix.c	24 Jun 2025 11:02:03 -0000	1.221
> > > +++ dev/pci/if_ix.c	25 Sep 2025 08:27:45 -0000
> > > @@ -1854,7 +1854,8 @@ ixgbe_setup_msix(struct ix_softc *sc)
> > >  	/* XXX the number of queues is limited to what we can keep stats on */
> > >  	maxq = (sc->hw.mac.type == ixgbe_mac_82598EB) ? 8 : 16;
> > >  
> > > -	sc->sc_intrmap = intrmap_create(&sc->dev, nmsix, maxq, 0);
> > > +	sc->sc_intrmap = intrmap_create(&sc->dev, nmsix,
> > > +	    MIN(maxq, IF_MAX_VECTORS), 0);
> > >  	sc->num_queues = intrmap_count(sc->sc_intrmap);
> > >  }
> > >  
> > > Index: dev/pci/if_ixl.c
> > > ===================================================================
> > > RCS file: /data/mirror/openbsd/cvs/src/sys/dev/pci/if_ixl.c,v
> > > diff -u -p -r1.109 if_ixl.c
> > > --- dev/pci/if_ixl.c	17 Sep 2025 12:54:19 -0000	1.109
> > > +++ dev/pci/if_ixl.c	27 Sep 2025 07:29:57 -0000
> > > @@ -1782,8 +1782,9 @@ ixl_attach(struct device *parent, struct
> > >  		if (nmsix > 1) { /* we used 1 (the 0th) for the adminq */
> > >  			nmsix--;
> > >  
> > > -			sc->sc_intrmap = intrmap_create(&sc->sc_dev,
> > > -			    nmsix, IXL_MAX_VECTORS, INTRMAP_POWEROF2);
> > > +			sc->sc_intrmap = intrmap_create(&sc->sc_dev, nmsix,
> > > +			    MIN(IXL_MAX_VECTORS, IF_MAX_VECTORS),
> > > +			    INTRMAP_POWEROF2);
> > >  			nqueues = intrmap_count(sc->sc_intrmap);
> > >  			KASSERT(nqueues > 0);
> > >  			KASSERT(powerof2(nqueues));
> > > Index: net/if.h
> > > ===================================================================
> > > RCS file: /data/mirror/openbsd/cvs/src/sys/net/if.h,v
> > > diff -u -p -r1.221 if.h
> > > --- net/if.h	9 Sep 2025 09:16:18 -0000	1.221
> > > +++ net/if.h	25 Sep 2025 08:27:45 -0000
> > > @@ -526,6 +526,9 @@ struct if_sffpage {
> > >  #include <net/if_arp.h>
> > >  
> > >  #ifdef _KERNEL
> > > +
> > > +#define IF_MAX_VECTORS		8
> > > +
> > >  struct socket;
> > >  struct ifnet;
> > >  struct ifq_ops;
> > > Index: net/ifq.c
> > > ===================================================================
> > > RCS file: /data/mirror/openbsd/cvs/src/sys/net/ifq.c,v
> > > diff -u -p -r1.62 ifq.c
> > > --- net/ifq.c	28 Jul 2025 05:25:44 -0000	1.62
> > > +++ net/ifq.c	25 Sep 2025 08:27:45 -0000
> > > @@ -255,7 +255,7 @@ void
> > >  ifq_init(struct ifqueue *ifq, struct ifnet *ifp, unsigned int idx)
> > >  {
> > >  	ifq->ifq_if = ifp;
> > > -	ifq->ifq_softnet = net_tq(idx);
> > > +	ifq->ifq_softnet = net_tq(ifp->if_index * IF_MAX_VECTORS + idx);
> > >  	ifq->ifq_softc = NULL;
> > >  
> > >  	mtx_init(&ifq->ifq_mtx, IPL_NET);
> > > 
>
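
One note on the ifq.c part: assuming net_tq() still picks a softnet
taskq by taking its argument modulo the number of softnet threads (I
haven't rechecked the implementation), a quick userspace sketch shows
where the queues land:

	#include <stdio.h>

	#define IF_MAX_VECTORS	8

	int
	main(void)
	{
		unsigned int nettaskqs = 8;	/* assumed thread count */
		unsigned int if_index, idx;

		/* softnet thread each ifq would be assigned to */
		for (if_index = 1; if_index <= 2; if_index++)
			for (idx = 0; idx < 4; idx++)
				printf("if_index %u queue %u -> softnet %u\n",
				    if_index, idx,
				    (if_index * IF_MAX_VECTORS + idx) %
				    nettaskqs);
		return 0;
	}

With IF_MAX_VECTORS equal to the thread count, the if_index offset
cancels out and queue idx of every device lands on thread idx, so the
spreading comes from devices using multiple queues.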