From: Janne Johansson
Subject: Re: SoftLRO for ixl(4), bnxt(4) and em(4)
To: Jan Klemkow
Cc: tech@openbsd.org
Date: Wed, 18 Dec 2024 19:17:46 +0100

> This diff introduces a software solution for TCP Large Receive Offload
> (SoftLRO) for network interfaces that don't have hardware support for it.
> This is needed at least for newer Intel interfaces, as their
> documentation says that LRO a.k.a. Receive Side Coalescing (RSC) has to
> be done by software.
> This diff coalesces TCP segments during the receive interrupt before
> queueing them.  Thus, our TCP/IP stack has to process fewer packet
> headers per amount of received data.
>
> Even if we saturate em(4) without any of these techniques, it's also part
> of this diff.  I'm interested if this diff helps to reach 1 Gbit/s on old
> or slow hardware.

I tested your latest patch on one of my octeons with cnmac(4) interfaces.

> If you want
> to test this implementation with your favorite interface, just replace
> the ml_enqueue() call with the new tcp_softlro_enqueue() (as seen
> below).  It should work with all kinds of network interfaces.
>
> Any comments and test reports are welcome.

It seems to give my box 20-25% more iperf3 bandwidth, from 200-220 Mbit/s
to 260-270 Mbit/s, while generating and sinking on the same machine.

I could not just include tcp_var.h in if_cnmac.c, since lots of other
things would then be undefined, so I just added the prototype for the
softlro call to the file.
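
For anyone who has not looked at the coalescing idea itself, here is a
rough userland sketch of the principle.  It is not the code from Jan's
diff and says nothing about how tcp_softlro_enqueue() is implemented;
struct seg, seg_mergeable() and coalesce() are made-up names, and the
"flow" field is a stand-in for the real address/port comparison.  It
only shows why in-order segments of one flow turn into fewer, larger
packets for the stack to process:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one received TCP segment. */
struct seg {
	uint32_t flow;	/* stand-in for the src/dst address and port tuple */
	uint32_t seq;	/* sequence number of the first payload byte */
	uint32_t len;	/* payload length in bytes */
};

/*
 * b can be glued behind a if both belong to the same flow and b starts
 * exactly where a ends.  The uint32_t addition wraps modulo 2^32, the
 * same way TCP sequence numbers do.
 */
static bool
seg_mergeable(const struct seg *a, const struct seg *b)
{
	return a->flow == b->flow && a->seq + a->len == b->seq;
}

/*
 * Coalesce the array in place and return the new number of segments.
 * Each segment is only compared against the most recently kept one
 * (an assumption made to keep the sketch short).
 */
static size_t
coalesce(struct seg *s, size_t n)
{
	size_t i, out = 0;

	assert(s != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (out > 0 && seg_mergeable(&s[out - 1], &s[i]))
			s[out - 1].len += s[i].len;	/* one header saved */
		else
			s[out++] = s[i];
	}
	return out;
}
```

Feeding it two back-to-back 1448-byte segments of the same flow leaves
a single 2896-byte entry.  A real implementation has to check a lot
more before merging (TCP flags, options, checksum state, a size limit),
none of which is modelled here.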

Sorry if gmail formatting breaks the patch, but here it is:

$ cvs -q diff -u
Index: if_cnmac.c
===================================================================
RCS file: /cvs/src/sys/arch/octeon/dev/if_cnmac.c,v
retrieving revision 1.86
diff -u -p -u -r1.86 if_cnmac.c
--- if_cnmac.c 20 May 2024 23:13:33 -0000 1.86
+++ if_cnmac.c 18 Dec 2024 18:02:08 -0000
@@ -173,6 +173,8 @@ int cnmac_kstat_read(struct kstat *);
 void cnmac_kstat_tick(struct cnmac_softc *);
 #endif
 
+void tcp_softlro_enqueue(struct mbuf_list *, struct mbuf *);
+
 /* device parameters */
 int cnmac_param_pko_cmd_w0_n2 = 1;
 
@@ -306,7 +308,7 @@ cnmac_attach(struct device *parent, stru
 	strncpy(ifp->if_xname, sc->sc_dev.dv_xname, sizeof(ifp->if_xname));
 	ifp->if_softc = sc;
 	ifp->if_flags = IFF_BROADCAST | IFF_SIMPLEX | IFF_MULTICAST;
-	ifp->if_xflags = IFXF_MPSAFE;
+	ifp->if_xflags = IFXF_MPSAFE|IFXF_LRO;
 	ifp->if_ioctl = cnmac_ioctl;
 	ifp->if_qstart = cnmac_start;
 	ifp->if_watchdog = cnmac_watchdog;
@@ -314,7 +316,8 @@ cnmac_attach(struct device *parent, stru
 	ifq_init_maxlen(&ifp->if_snd, max(GATHER_QUEUE_SIZE, IFQ_MAXLEN));
 
 	ifp->if_capabilities = IFCAP_VLAN_MTU | IFCAP_CSUM_TCPv4 |
-	    IFCAP_CSUM_UDPv4 | IFCAP_CSUM_TCPv6 | IFCAP_CSUM_UDPv6;
+	    IFCAP_CSUM_UDPv4 | IFCAP_CSUM_TCPv6 | IFCAP_CSUM_UDPv6 |
+	    IFCAP_LRO;
 
 	cn30xxgmx_set_filter(sc->sc_gmx_port);
 
@@ -1246,7 +1249,10 @@ cnmac_recv(struct cnmac_softc *sc, uint6
 		    M_TCP_CSUM_IN_OK | M_UDP_CSUM_IN_OK;
 	}
 
-	ml_enqueue(ml, m);
+	if (ISSET(ifp->if_xflags, IFXF_LRO))
+		tcp_softlro_enqueue(ml, m);
+	else
+		ml_enqueue(ml, m);
 
 	return nmbuf;
 

-- 
May the most significant bit of your life be positive.