Download raw body.
ix(4) LRO/TSO fix
On 15.2.2024. 10:23, Claudio Jeker wrote: > On Thu, Feb 15, 2024 at 07:55:06AM +0100, Marcus Glocker wrote: >> Hi, >> >> When we did commit the em(4) TSO diff, we got an issues reported by >> Hrvoje, where a combination of ix(4), em(4), and vlan(4) was creating >> watchdog timeouts on em(4). >> >> In the ix(4) RX/LRO code path, the driver makes a decision whether to >> TSO tag the received packet for forwarding. For that, the header length >> gets calculated to find out about the packet payload. Two things go >> wrong in this calculation: >> >> 1. The VLAN header length is included. But the NIC cuts off the VLAN >> header already. >> >> 2. Possible padding (as seen for example on very small packets) is not >> considered, and counted as data payload. >> >> What happened after the em(4) TSO diff commit, is that ix(4) was >> receiving ACK packets with 0 bytes payload, but did calculate 2 bytes >> payload for it. Hence, the packet was TSO tagged and forwarded. Our >> TCP stack did re-calculate 0 bytes payload, and forwarded the packet to >> em(4). There it reached the new TSO code path which was requested to >> offload a packet with 0 bytes payload, which makes the TX fail, and >> creates an watchdog timeout. >> >> The following ix(4) diff is fixing the calculation as following: >> >> 1. Don't include the VLAN header length. >> >> 2. Do calculate the real packet data payload by using the total IP >> length field. With the last ether_extract_headers() commit, the >> IP length field can now be easily accessed. >> >> On the Hrvoje setup, no more em(4) watchdog timeouts can be seen with >> this diff. >> >> OK? > > Accessing the packet data inside the rxeof function is not ideal. > Mainly we fault in DMA memory into the CPU cache and then do the same > again later on the CPU handling the data. > In an ideal world the driver would only touch the metadata but never > access the DMA memory so that there is no cache pollution. As usual > reality is far from an ideal world. > > I wonder if other LRO capable devices will need similar changes and if in > that case this should be moved to the generic input processing. I have mcx, ixl and bnxt and can put them in that test box with em interface. Would that be ok ?
ix(4) LRO/TSO fix