Index | Thread | Search

From:
Claudio Jeker <cjeker@diehard.n-r-g.com>
Subject:
Re: TSO with small packets
To:
Alexander Bluhm <alexander.bluhm@gmx.net>
Cc:
tech@openbsd.org
Date:
Fri, 9 Feb 2024 13:50:53 +0100

Download raw body.

Thread
On Fri, Feb 09, 2024 at 01:37:45PM +0100, Alexander Bluhm wrote:
> On Fri, Feb 09, 2024 at 12:11:52PM +0100, Claudio Jeker wrote:
> > On Fri, Feb 09, 2024 at 11:37:32AM +0100, Alexander Bluhm wrote:
> > > Hi,
> > > 
> > > After a lot of debugging and testing with Hrvoje and mglocker@, we
> > > have found the cause for occasional watchdog timeouts with em TSO
> > > diff.
> > > 
> > > In this setup TCP packets are routed from ix(4) with LRO to em(4)
> > > with TSO.  It happens that ix hardware coalesces packets that are
> > > in total smaller than MTU.  In real live this is rare as either you
> > > have large TCP packets in a row, or small TCP packets from time to
> > > time.
> > > 
> > > Small packets that went though LRO have the TSO bit set.  But TSO
> > > is not necessary as only few small TCP packets have been reassembled.
> > > This confuses em hardware.  So clear TSO flag during interface
> > > output path in that case.
> > > 
> > > This is only relevant when forwarding to hardware TSO.  Our stack
> > > does not generate such packets and can handle them.  Diff below
> > > prevents them to reach hardware TSO.
> > > 
> > > ok?
> > 
> > But shouldn't the system send out those small packets split up again?
> > At least I think the idea is that the effect of LRO is reversed while
> > sending out so that the same amount of packets are sent as without LRO/TSO.
> 
> The small packets in our case have 2 bytes payload after LRO.  We
> instruct em to split them in 1 byte packets and that fails with
> watchdog timeouts.  It works if I clear the TSO bit.
> 
> But you are right, my diff may break path MTU discovery.  So
> m_pkthdr.len <= mtu is not ideal.  Is there a minmum TSO size
> hardware can handle?  We might use that instead of MTU.

Probably the minumum IPv6 packet size (576 IIRC) would be a good minimum.
 
> bluhm
> 
> > > Index: netinet/tcp_output.c
> > > ===================================================================
> > > RCS file: /data/mirror/openbsd/cvs/src/sys/netinet/tcp_output.c,v
> > > diff -u -p -r1.141 tcp_output.c
> > > --- netinet/tcp_output.c	26 Nov 2023 22:08:10 -0000	1.141
> > > +++ netinet/tcp_output.c	8 Feb 2024 21:17:50 -0000
> > > @@ -1360,6 +1360,12 @@ tcp_if_output_tso(struct ifnet *ifp, str
> > >  	/* caller must fail later or fragment */
> > >  	if (!ISSET((*mp)->m_pkthdr.csum_flags, M_TCP_TSO))
> > >  		return 0;
> > > +	/* send without hardware TSO if interface can handle packet size */
> > > +	if ((*mp)->m_pkthdr.len <= mtu) {
> > > +		CLR((*mp)->m_pkthdr.csum_flags, M_TCP_TSO);
> > > +		return 0;
> > > +	}
> > > +	/* fragment or fail if interface cannot handle size after chopping */
> > >  	if ((*mp)->m_pkthdr.ph_mss > mtu) {
> > >  		CLR((*mp)->m_pkthdr.csum_flags, M_TCP_TSO);
> > >  		return 0;
> > > 
> > 
> > -- 
> > :wq Claudio
> 

-- 
:wq Claudio