From:
Jan Klemkow <jan@openbsd.org>
Subject:
Re: SoftLRO for ixl(4), bnxt(4) and em(4)
To:
Andrew Lemin <andrew.lemin@gmail.com>
Cc:
Mark Patruck <mark@wrapped.cx>, Alexander Bluhm <bluhm@openbsd.org>, tech@openbsd.org, Janne Johansson <icepic.dz@gmail.com>, Yuichiro NAITO <naito.yuichiro@gmail.com>
Date:
Sat, 5 Apr 2025 13:57:08 +0200

Thread
  • Lucas Gabriel Vuotto:

    SoftLRO for ixl(4), bnxt(4) and em(4)

  • Janne Johansson:

    SoftLRO for ixl(4), bnxt(4) and em(4)

  • Hi Andrew,
    
    On Fri, Mar 28, 2025 at 04:00:18PM GMT, Andrew Lemin wrote:
    > Hi. Nice! Really great to see this :)
    > 
    > Jan; "But, do we want to merge TCP segments with VLAN tags that differ in
    > priority bits?  And what priority bits should we choose for the resulting
    > packet?"
    
    I reread dlg@'s reply to my first diff.  I guess he wants me to use both
    VLAN macros: EVL_VLANOFTAG() and EVL_PRIOFTAG().  I'll send a new diff.
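
A minimal userland sketch of what using both macros could look like.  It
is not taken from the diff.  The macro bodies below are local stand-ins
that assume the usual 802.1Q tag layout (12-bit VLAN ID in the low bits,
3-bit priority in the top bits), and lro_vlan_match() is a hypothetical
helper name.

```c
#include <stdint.h>

/*
 * Local stand-ins, assumed to follow the 802.1Q tag layout:
 * bits 0-11 VLAN ID, bit 12 DEI/CFI, bits 13-15 priority.
 */
#define EVL_VLID_MASK		0x0FFF
#define EVL_PRIO_BITS		13
#define EVL_VLANOFTAG(tag)	((tag) & EVL_VLID_MASK)
#define EVL_PRIOFTAG(tag)	(((tag) >> EVL_PRIO_BITS) & 7)

/*
 * Hypothetical helper: two tagged segments are merge candidates only
 * if both the VLAN ID and the priority bits are equal.  The DEI/CFI
 * bit is ignored in this sketch.
 */
static int
lro_vlan_match(uint16_t tag_a, uint16_t tag_b)
{
	return EVL_VLANOFTAG(tag_a) == EVL_VLANOFTAG(tag_b) &&
	    EVL_PRIOFTAG(tag_a) == EVL_PRIOFTAG(tag_b);
}
```

With such a check, segments of one connection that differ only in
priority would never be merged, so the question of which priority the
merged packet should carry does not arise.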
    
    > I would suggest because LRO creates a form of buffering, we do not want to
    
    This SoftLRO implementation does not create additional buffering for
    network packets.  It just merges them while copying them from the
    already existing receive buffer to our internal queues.
    
    > merge packets with different Prio's. For example on the same session/tuple
    > in VLAN, we mark Prio differently according to DiffServ and for ACKs.
    > Because we want ACKs to be forwarded first in a priority PF queue, whilst
    > the payload packets can be buffered in a deeper PF queue, allowing a
    > long-fat-link to be filled.
    > 
    > Yuichiro; "Hi, I tested SoftLRO patch, and as you said, I saw that TCP
    > receive performance improved. However, packet forwarding performance has
    > decreased."
    > 
    > This is definitely common with LRO as packets are being delayed by the
    > buffering. Which impacts delay-based CC algorithms. Is there a limit on how
    > many packets we merge to limit LRO buffer delay?
    
    This algorithm does not wait for packets to be merged.  It just merges the
    packets in the receive buffer.  Thus, no additional delay.
    
    > Your testing was interesting Yuichiro. When you said you were getting lots
    > of ACKs, and showed the sequence number would plateau briefly, were you
    > getting lots of retransmitted (S)ACKs?
    > 
    > From Bluhm's comment; "I got a uvm fault in ixl_txeof() with this diff.  Test
    > was sending single stream TCP from Linux to Linux machine while OpenBSD was
    > forwarding.  ixl(4) LRO was activated.  So it looks like it was sending a
    > packet that was previously received with LRO."
    > 
    > That might explain why, if it was retransmitting ACKs.. And why using LRO
    > on the endpoints might be masking the issue (by slowing down SACK)?
    
    SACK packets are ignored by this SoftLRO implementation and are also
    ignored by the ix(4) Intel NICs according to their technical
    documentation.
    
    > Yuichiro; "One solution of this issue is reducing the number of ACK packets
    > from the FreeBSD kernel. I set LRO option in the iperf3 server interface
    > (I'm using X520 nics for the FreeBSD host). The LRO feature merges received
    > TCP packets of the same stream, the number of packets received for the
    > FreeBSD kernel will be reduced, so that the number of ACKs returned will be
    > reduced.
    > 
    > I think this workaround only makes sense when packet loss is low on
    > ethernet. When one of the links is Wifi, and the client does not do LRO,
    > then we have worse performance. We usually don't want to limit the number
    > of ACKs on lossy links, as ACKs/SACKs (and the resultant retransmits)
    > should not be delayed.
    > 
    > Thinking through everything, it seems we should not merge ACKs, and should
    > pass them immediately to avoid ACK buffering, allowing endpoints to know
    > packets are received and the pipe can continue to be filled more, as the
    > bandwidth delay product now includes the LRO buffering.
    
    This SoftLRO implementation DOES NOT merge SACK or any other ACK packets.
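
As an illustration only, a candidate filter along these lines could look
like the sketch below.  It is not the code from the diff.  The function
name lro_candidate() and its parameters are hypothetical, and the exact
set of rejected flags is an assumption; only the flag bit values are the
standard TCP header ones.

```c
#include <stdint.h>

/* Standard TCP header flag bits. */
#define TH_FIN	0x01
#define TH_SYN	0x02
#define TH_RST	0x04
#define TH_PUSH	0x08
#define TH_ACK	0x10
#define TH_URG	0x20

/*
 * Hypothetical filter: returns 1 if a segment may be merged, 0 if it
 * has to be passed on unchanged.
 */
static int
lro_candidate(uint8_t flags, uint32_t paylen, int has_sack_opt)
{
	/* Pure ACKs carry no payload and are never merged. */
	if (paylen == 0)
		return 0;
	/* Segments carrying a SACK option are left alone. */
	if (has_sack_opt)
		return 0;
	/* Assumption: control segments are not merged either. */
	if (flags & (TH_FIN | TH_SYN | TH_RST | TH_URG))
		return 0;
	return 1;
}
```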
    
    > And when merging packets, do we stop merging when we see a missing sequence
    > number/packet?
    > Ie, If we have a lossy link, we merge as we can (received in sequence
    > without any losses), but when losses occur, we pass what we
    > have immediately, to allow SACKs to work and recover the lost packets etc?
    
    You may have missed it; just read the code after the comment "Check for
    continues segments." again.
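
For readers without the diff at hand, a check of this kind can be
sketched as follows.  This is an assumption about its general shape, not
a copy of the code; seg_continues() is a hypothetical name.

```c
#include <stdint.h>

/*
 * Hypothetical check: the next segment continues the previous one only
 * if it starts exactly where the previous payload ended.  The addition
 * is done in 32 bits so that sequence number wraparound works.
 */
static int
seg_continues(uint32_t prev_seq, uint32_t prev_len, uint32_t next_seq)
{
	return (uint32_t)(prev_seq + prev_len) == next_seq;
}
```

A hole (next_seq too large) or a retransmission (next_seq too small)
both fail the check, so the data merged so far is passed on as it is.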
    
    > And what about out of order packets? Eg if a packet is out of order, but in
    > receive buf, do we reorder during merge (look behind probably too
    > expensive). Guessing it is more likely we see the hole and just forward the
    > contiguous packets we have immediately (causing a retransmission for the
    > out of order packet, even though we already received it)?
    
    This implementation does not reorder packets of the same TCP connection.
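
A simplified model of merging in arrival order, to show the effect.
This is a sketch under stated assumptions, not the driver code: struct
seg, its fields, and lro_merge_batch() are hypothetical, the flow field
stands in for the full connection tuple, and a segment is compared only
with the most recent output entry.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one received TCP segment. */
struct seg {
	uint32_t flow;	/* stand-in for the connection tuple */
	uint32_t seq;	/* sequence number of the first payload byte */
	uint32_t len;	/* payload length in bytes */
};

/*
 * Walk the batch once, in arrival order.  A segment is appended to the
 * previous output entry if it belongs to the same flow and continues
 * it directly; otherwise it becomes a new output entry.  Nothing is
 * sorted and nothing is held back for a later batch.  Returns the
 * number of entries left in segs[].
 */
static size_t
lro_merge_batch(struct seg *segs, size_t n)
{
	size_t i, out = 0;

	for (i = 0; i < n; i++) {
		if (out > 0 &&
		    segs[out - 1].flow == segs[i].flow &&
		    (uint32_t)(segs[out - 1].seq + segs[out - 1].len) ==
		    segs[i].seq) {
			segs[out - 1].len += segs[i].len;
			continue;
		}
		segs[out++] = segs[i];
	}
	return out;
}
```

In this model, segments arriving as 1000, 1200, 1100 (100 bytes each)
leave the function as three separate entries in the same order, while
1000, 1100 would be merged into one entry of 200 bytes.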
    
    > Sadly I have moved home recently, and my lab is all in boxes. But I'm keen
    > to test with "WIFI7 client + 10G WIFI7 AP + OpenBSD router with ixl NICs +
    > FreeBSD server with mcx NICs".
    > 
    > Thanks for your great work :)
    
    You are welcome.
    
    Thanks,
    Jan
    
    