Index | Thread | Search

From:
Jan Klemkow <jan@openbsd.org>
Subject:
Re: vmx TSO pullup headers into first mbuf
To:
Alexander Bluhm <bluhm@openbsd.org>
Cc:
tech@openbsd.org
Date:
Fri, 14 Jun 2024 17:44:31 +0200

Download raw body.

Thread
On Thu, Jun 13, 2024 at 12:11:24AM GMT, Alexander Bluhm wrote:
> In vmx(4) TSO must pullup headers into first mbuf.
> 
> Forwarding IPv6 packets from vmx with LRO to vmx with TSO does not
> work.  vmx(4) has the requirement that all headers are in the first
> mbuf.  ip6_forward() is quite dumb.  It calls m_copym() to create
> a mbuf that might be used for sending ICMP6 later.  After passing
> the forwarded packet down to ether_encap(), m_prepend() is used to
> restore the ethernet header.  As the mbuf cluster has been copied,
> it is read only now.  That means m_prepend() does not provide the
> empty space at the beginning of the cluster, but allocates a new
> mbuf that contains only the ethernet header.  vmx(4) cannot transmit
> such a TSO packet and drops it.
> 
> Solution is to call m_pullup() in vmxnet3_start().  If we ended up
> in such a miserable condition, use the first mbuf in the chain and
> move all headers into it.
> 
> ok?

Looks good.  Comment below.

Thanks,
Jan

> Index: dev/pci/if_vmx.c
> ===================================================================
> RCS file: /data/mirror/openbsd/cvs/src/sys/dev/pci/if_vmx.c,v
> diff -u -p -r1.87 if_vmx.c
> --- dev/pci/if_vmx.c	7 Jun 2024 08:44:25 -0000	1.87
> +++ dev/pci/if_vmx.c	12 Jun 2024 14:34:11 -0000
> @@ -1619,6 +1619,8 @@ vmxnet3_start(struct ifqueue *ifq)
>  	rgen = ring->gen;
>  
>  	for (;;) {
> +		int hdrlen;
> +
>  		if (free <= NTXSEGS) {
>  			ifq_set_oactive(ifq);
>  			break;
> @@ -1627,6 +1629,29 @@ vmxnet3_start(struct ifqueue *ifq)
>  		m = ifq_dequeue(ifq);
>  		if (m == NULL)
>  			break;
> +
> +		/*
> +		 * Headers for Ether, IP, TCP including options must lay in
> +		 * first mbuf to support TSO.  Usually our stack gets that
> +		 * right. To avoid packet parsing here, make a rough estimate
> +		 * for simple IPv4.  Cases seen in the wild contain only ether 
> +		 * header in separate mbuf.  To support IPv6 with TCP options,
> +		 * move as much as possible into first mbuf.  Realloc mbuf
> +		 * before bus dma load.
> +		 */
> +		hdrlen = sizeof(struct ether_header) + sizeof(struct ip) +
> +		    sizeof(struct tcphdr);
> +		if (ISSET(m->m_pkthdr.csum_flags, M_TCP_TSO) &&
> +		    m->m_len < hdrlen && hdrlen <= m->m_pkthdr.len) {
> +			hdrlen = MHLEN;

> +			hdrlen -= mtod(m, unsigned long) & (sizeof(long) - 1);

I would clarify this line with an extra comment.  Thus, non-hardcore
kernel hacker also know what it does and why its necessary.

ok jan@

> +			if (hdrlen > m->m_pkthdr.len)
> +				hdrlen = m->m_pkthdr.len;
> +			if ((m = m_pullup(m, hdrlen)) == NULL) {
> +				ifq->ifq_errors++;
> +				continue;
> +			}
> +		}
>  
>  		map = ring->dmap[prod];
>  
>