Download raw body.
vmx TSO pullup headers into first mbuf
On Thu, Jun 13, 2024 at 12:11:24AM GMT, Alexander Bluhm wrote:
> In vmx(4) TSO must pullup headers into first mbuf.
>
> Forwarding IPv6 packets from vmx with LRO to vmx with TSO does not
> work. vmx(4) has the requirement that all headers are in the first
> mbuf. ip6_forward() is quite dumb. It calls m_copym() to create
> a mbuf that might be used for sending ICMP6 later. After passing
> the forwarded packet down to ether_encap(), m_prepend() is used to
> restore the ethernet header. As the mbuf cluster has been copied,
> it is read only now. That means m_prepend() does not provide the
> empty space at the beginning of the cluster, but allocates a new
> mbuf that contains only the ethernet header. vmx(4) cannot transmit
> such a TSO packet and drops it.
>
> Solution is to call m_pullup() in vmxnet3_start(). If we ended up
> in such a miserable condition, use the first mbuf in the chain and
> move all headers into it.
>
> ok?
Looks good. Comment below.
Thanks,
Jan
> Index: dev/pci/if_vmx.c
> ===================================================================
> RCS file: /data/mirror/openbsd/cvs/src/sys/dev/pci/if_vmx.c,v
> diff -u -p -r1.87 if_vmx.c
> --- dev/pci/if_vmx.c 7 Jun 2024 08:44:25 -0000 1.87
> +++ dev/pci/if_vmx.c 12 Jun 2024 14:34:11 -0000
> @@ -1619,6 +1619,8 @@ vmxnet3_start(struct ifqueue *ifq)
> rgen = ring->gen;
>
> for (;;) {
> + int hdrlen;
> +
> if (free <= NTXSEGS) {
> ifq_set_oactive(ifq);
> break;
> @@ -1627,6 +1629,29 @@ vmxnet3_start(struct ifqueue *ifq)
> m = ifq_dequeue(ifq);
> if (m == NULL)
> break;
> +
> + /*
> + * Headers for Ether, IP, TCP including options must lay in
> + * first mbuf to support TSO. Usually our stack gets that
> + * right. To avoid packet parsing here, make a rough estimate
> + * for simple IPv4. Cases seen in the wild contain only ether
> + * header in separate mbuf. To support IPv6 with TCP options,
> + * move as much as possible into first mbuf. Realloc mbuf
> + * before bus dma load.
> + */
> + hdrlen = sizeof(struct ether_header) + sizeof(struct ip) +
> + sizeof(struct tcphdr);
> + if (ISSET(m->m_pkthdr.csum_flags, M_TCP_TSO) &&
> + m->m_len < hdrlen && hdrlen <= m->m_pkthdr.len) {
> + hdrlen = MHLEN;
> + hdrlen -= mtod(m, unsigned long) & (sizeof(long) - 1);
I would clarify this line with an extra comment. Thus, non-hardcore
kernel hacker also know what it does and why its necessary.
ok jan@
> + if (hdrlen > m->m_pkthdr.len)
> + hdrlen = m->m_pkthdr.len;
> + if ((m = m_pullup(m, hdrlen)) == NULL) {
> + ifq->ifq_errors++;
> + continue;
> + }
> + }
>
> map = ring->dmap[prod];
>
>
vmx TSO pullup headers into first mbuf