From: Jan Klemkow Subject: Re: vmx TSO pullup headers into first mbuf To: Alexander Bluhm Cc: tech@openbsd.org Date: Fri, 14 Jun 2024 17:44:31 +0200 On Thu, Jun 13, 2024 at 12:11:24AM GMT, Alexander Bluhm wrote: > In vmx(4) TSO must pullup headers into first mbuf. > > Forwarding IPv6 packets from vmx with LRO to vmx with TSO does not > work. vmx(4) has the requirement that all headers are in the first > mbuf. ip6_forward() is quite dumb. It calls m_copym() to create > a mbuf that might be used for sending ICMP6 later. After passing > the forwarded packet down to ether_encap(), m_prepend() is used to > restore the ethernet header. As the mbuf cluster has been copied, > it is read only now. That means m_prepend() does not provide the > empty space at the beginning of the cluster, but allocates a new > mbuf that contains only the ethernet header. vmx(4) cannot transmit > such a TSO packet and drops it. > > Solution is to call m_pullup() in vmxnet3_start(). If we ended up > in such a miserable condition, use the first mbuf in the chain and > move all headers into it. > > ok? Looks good. Comment below. Thanks, Jan > Index: dev/pci/if_vmx.c > =================================================================== > RCS file: /data/mirror/openbsd/cvs/src/sys/dev/pci/if_vmx.c,v > diff -u -p -r1.87 if_vmx.c > --- dev/pci/if_vmx.c 7 Jun 2024 08:44:25 -0000 1.87 > +++ dev/pci/if_vmx.c 12 Jun 2024 14:34:11 -0000 > @@ -1619,6 +1619,8 @@ vmxnet3_start(struct ifqueue *ifq) > rgen = ring->gen; > > for (;;) { > + int hdrlen; > + > if (free <= NTXSEGS) { > ifq_set_oactive(ifq); > break; > @@ -1627,6 +1629,29 @@ vmxnet3_start(struct ifqueue *ifq) > m = ifq_dequeue(ifq); > if (m == NULL) > break; > + > + /* > + * Headers for Ether, IP, TCP including options must lay in > + * first mbuf to support TSO. Usually our stack gets that > + * right. To avoid packet parsing here, make a rough estimate > + * for simple IPv4. Cases seen in the wild contain only ether > + * header in separate mbuf. To support IPv6 with TCP options, > + * move as much as possible into first mbuf. Realloc mbuf > + * before bus dma load. > + */ > + hdrlen = sizeof(struct ether_header) + sizeof(struct ip) + > + sizeof(struct tcphdr); > + if (ISSET(m->m_pkthdr.csum_flags, M_TCP_TSO) && > + m->m_len < hdrlen && hdrlen <= m->m_pkthdr.len) { > + hdrlen = MHLEN; > + hdrlen -= mtod(m, unsigned long) & (sizeof(long) - 1); I would clarify this line with an extra comment. Thus, non-hardcore kernel hacker also know what it does and why its necessary. ok jan@ > + if (hdrlen > m->m_pkthdr.len) > + hdrlen = m->m_pkthdr.len; > + if ((m = m_pullup(m, hdrlen)) == NULL) { > + ifq->ifq_errors++; > + continue; > + } > + } > > map = ring->dmap[prod]; > >