Download raw body.
igc(4): fix recv. jumbo frames
On Thu, Aug 08, 2024 at 12:51:18PM +0200, Jan Klemkow wrote:
> On Wed, Aug 07, 2024 at 11:38:53PM +0200, Alexander Bluhm wrote:
> > On Thu, Aug 01, 2024 at 06:48:17PM +0200, Jan Klemkow wrote:
> > > On Wed, Jul 31, 2024 at 10:39:25AM -0600, Theo de Raadt wrote:
> > > > Alexander Bluhm <bluhm@openbsd.org> wrote:
> > > > > On Tue, Jul 30, 2024 at 08:12:08AM +0200, Jan Klemkow wrote:
> > > > > > The DMA mapping and allocation of mbufs for jumbo frames uses different
> > > > > > sizes. Thus, we are ending up with corrupt mbufs, which leads to panics
> > > > > > in later part of the TCP/IP stack:
> > > > > >
> > > > > > panic: kernel diagnostic assertion "M_DATABUF(m) + M_SIZE(m) >= (m->m_data + m->m_len)"
> > > > > > failed: file "/usr/src/sys/kern/uipc_mbuf.c", line 1364
> > > > > >
> > > > > > With the following diff, we use the same size for mapping and
> > > > > > allocation.
> > > > > >
> > > > > > ok?
> > > > >
> > > > > I have tested this diff successfully. Before jumbo frames with
> > > > > igc(4) paniced, now TCP is faster.
> > > > >
> > > > > But are we allocating too much in the non-jumbo case?
> > > > > sc->rx_mbuf_sz = MAX_JUMBO_FRAME_SIZE + ETHER_ALIGN;
> > > > > m = MCLGETL(NULL, M_DONTWAIT, sc->rx_mbuf_sz)
> > > > >
> > > > > This means for each receive packet we allocate a 16k mbuf cluster.
> > > > > Usually only 1500 byte are needed.
> > > > >
> > > > > ix(4) has this code and manages to chain the mbufs correctly.
> > > > > /* Use 2k clusters, even for jumbo frames */
> > > > > sc->rx_mbuf_sz = MCLBYTES + ETHER_ALIGN;
> > > >
> > > > This hardware chains fragments in the ring?
> > > >
> > > > It's a tradeoff. jumbos are rare, so why reserve so much memory to them
> > > > on hardware which can do better.
> > >
> > > I oriented my diff at the FreeBSD code. Also, the Linux code looks like
> > > as they don't use scatter/gather DMA.
> > >
> > > But, It works with the following diff below.
> > >
> > > The extra size of ETHER_ALIGN seems be wrong. It also leads to panics
> > > caused by corrupted mbufs. FreeBSD just use MCLBYTES for rx mbufs size.
> > > Also NetBSD removed ETHER_ALIGN from their copy of our driver [1].
> > >
> > > [1]: https://github.com/NetBSD/src/commit/fb38d839b48b9b6204dbbee1672454d6e719ba01
> > >
> > > Index: dev/pci/if_igc.c
> > > ===================================================================
> > > RCS file: /mount/openbsd/cvs/src/sys/dev/pci/if_igc.c,v
> > > diff -u -p -r1.25 if_igc.c
> > > --- dev/pci/if_igc.c 24 May 2024 06:02:53 -0000 1.25
> > > +++ dev/pci/if_igc.c 1 Aug 2024 15:45:58 -0000
> > > @@ -881,7 +881,7 @@ igc_init(void *arg)
> > > }
> > > igc_initialize_transmit_unit(sc);
> > >
> > > - sc->rx_mbuf_sz = MCLBYTES + ETHER_ALIGN;
> > > + sc->rx_mbuf_sz = MCLBYTES;
> > > /* Prepare receive descriptors and buffers. */
> > > if (igc_setup_receive_structures(sc)) {
> > > printf("%s: Could not setup receive structures\n",
> > > @@ -2159,7 +2159,7 @@ igc_allocate_receive_buffers(struct igc_
> > > rxbuf = rxr->rx_buffers;
> > > for (i = 0; i < sc->num_rx_desc; i++, rxbuf++) {
> > > error = bus_dmamap_create(rxr->rxdma.dma_tag,
> > > - MAX_JUMBO_FRAME_SIZE, 1, MAX_JUMBO_FRAME_SIZE, 0,
> > > + MAX_JUMBO_FRAME_SIZE, IGC_MAX_SCATTER, MCLBYTES, 0,
> > > BUS_DMA_NOWAIT, &rxbuf->map);
> > > if (error) {
> > > printf("%s: Unable to create RX DMA map\n",
> >
> > The diff above survives my tests. But the #define IGC_MAX_SCATTER
> > 40 seems like a magic value. It is nowhere mentioned in the spec.
> > With MAX_JUMBO_FRAME_SIZE == 9216 and MCLBYTES == 2048 there can
> > be at most 5 segements. The diff below also passes my tests.
> >
> > Is the more specific value IGC_MAX_GATHER == 5 better?
>
> I get your point to change the name from SCATTER to something else, but
> GATHER isn't used in any other driver. In some cases the value is just
> hardcoded all other cases use something like MAX_SEGMENTS or MAX_SEGS.
> This also better matches with the parameter nsegments of
> bus_dmamap_create().
>
> > Or could the hardware try to create larger frames?
>
> bnx(4) uses 8 instead of 5 for busdma mappings for instance. And many
> other driver just map and allocate jumbo size or hardmtu in one segment
> for their receive buffers.
>
> Thus I would suggest to increase it a bit. Better safe the sorry, till
> we know a accurate or official value for this limit.
>
> ok?
After testing all the diffs and looking at various spontaneous
failures, I come to the conclusion that the IGC_MAX_SCATTER == 40
diff works best. It is hard to tell if a sinlge test failure is
coincidence or a flaw in the software. But it looks like my
IGC_MAX_GATHER == 5 did not work well with jumbo frames and TCP
socket splicing. IGC_MAX_SEGS == 8 fixes this, but now some UDP
tests fail sometimes.
That IGC_MAX_SCATTER == 40 had no such failures may be luck. Let's
commit your original diff above to be on the safe side.
OK bluhm@ for MAX_JUMBO_FRAME_SIZE, IGC_MAX_SCATTER, MCLBYTES
After that I will activate regular tests with jumbo frames for all
drivers. There are more bugs to find.
igc(4): fix recv. jumbo frames