Index | Thread | Search

From:
Alexander Bluhm <bluhm@openbsd.org>
Subject:
vmx TSO pullup headers into first mbuf
To:
tech@openbsd.org
Date:
Thu, 13 Jun 2024 00:11:24 +0200

Download raw body.

Thread
Hi,

In vmx(4) TSO must pullup headers into first mbuf.

Forwarding IPv6 packets from vmx with LRO to vmx with TSO does not
work.  vmx(4) has the requirement that all headers are in the first
mbuf.  ip6_forward() is quite dumb.  It calls m_copym() to create
a mbuf that might be used for sending ICMP6 later.  After passing
the forwarded packet down to ether_encap(), m_prepend() is used to
restore the ethernet header.  As the mbuf cluster has been copied,
it is read only now.  That means m_prepend() does not provide the
empty space at the beginning of the cluster, but allocates a new
mbuf that contains only the ethernet header.  vmx(4) cannot transmit
such a TSO packet and drops it.

Solution is to call m_pullup() in vmxnet3_start().  If we ended up
in such a miserable condition, use the first mbuf in the chain and
move all headers into it.

ok?

bluhm

Index: dev/pci/if_vmx.c
===================================================================
RCS file: /data/mirror/openbsd/cvs/src/sys/dev/pci/if_vmx.c,v
diff -u -p -r1.87 if_vmx.c
--- dev/pci/if_vmx.c	7 Jun 2024 08:44:25 -0000	1.87
+++ dev/pci/if_vmx.c	12 Jun 2024 14:34:11 -0000
@@ -1619,6 +1619,8 @@ vmxnet3_start(struct ifqueue *ifq)
 	rgen = ring->gen;
 
 	for (;;) {
+		int hdrlen;
+
 		if (free <= NTXSEGS) {
 			ifq_set_oactive(ifq);
 			break;
@@ -1627,6 +1629,29 @@ vmxnet3_start(struct ifqueue *ifq)
 		m = ifq_dequeue(ifq);
 		if (m == NULL)
 			break;
+
+		/*
+		 * Headers for Ether, IP, TCP including options must lay in
+		 * first mbuf to support TSO.  Usually our stack gets that
+		 * right. To avoid packet parsing here, make a rough estimate
+		 * for simple IPv4.  Cases seen in the wild contain only ether 
+		 * header in separate mbuf.  To support IPv6 with TCP options,
+		 * move as much as possible into first mbuf.  Realloc mbuf
+		 * before bus dma load.
+		 */
+		hdrlen = sizeof(struct ether_header) + sizeof(struct ip) +
+		    sizeof(struct tcphdr);
+		if (ISSET(m->m_pkthdr.csum_flags, M_TCP_TSO) &&
+		    m->m_len < hdrlen && hdrlen <= m->m_pkthdr.len) {
+			hdrlen = MHLEN;
+			hdrlen -= mtod(m, unsigned long) & (sizeof(long) - 1);
+			if (hdrlen > m->m_pkthdr.len)
+				hdrlen = m->m_pkthdr.len;
+			if ((m = m_pullup(m, hdrlen)) == NULL) {
+				ifq->ifq_errors++;
+				continue;
+			}
+		}
 
 		map = ring->dmap[prod];