From: Vitaliy Makkoveev <mvs@openbsd.org>
Subject: Re: ip6 forward mcopy
To: Alexander Bluhm <bluhm@openbsd.org>
Cc: tech@openbsd.org
Date: Fri, 12 Jul 2024 14:46:35 +0300

On Fri, Jul 12, 2024 at 11:56:31AM +0200, Alexander Bluhm wrote:
> On Mon, Jun 24, 2024 at 11:33:44AM +0200, Alexander Bluhm wrote:
> > Forwarding IPv6 packets is slower than IPv4.  Reason is the m_copym()
> > that is done for every packet.  Just in case we may have to send
> > an ICMP6 packet, ip6_forward() creates a mbuf copy.  After that
> > mbuf cluster is read only, so for the ethernet header another mbuf
> > is allocated.  pf NAT and RDR ignores readonly clusters, so it also
> > modifies the potential ICMP6 packet.
> > 
> > IPv4 ip_forward() avoids all these problems by copying the leading
> > 68 bytes of the original packets onto the stack.  More is not need
> > for ICMP.  IPv6 RFC requires up to 1232 bytes in the ICMP6 packet.
> > This cannot be copied to the stack.
> > 
> > The reason for the difference is that the ICMP6 packet has to contain
> > the full header chain.  I think if we have a simple UDP or TCP
> > packet without chain, we could do a shortcut and just copy the
> > header to ICMP6 packet.
> > 
> > Do we want to violate the RFC and gain 30% performance for the
> > common TCP/UDP case?
> 
> Part of the diff have been commited.  For small packets we use stack
> memory, large packets need extra mbuf allocation.
> 
> Following diff truncates ICMP6 packtes to a reasonable length if
> the orignal packets has a final protocol header directly after the
> IPv6 header.  My list contains TCP, UDP, ESP as they cover the
> common cases and anything behind it should not be needed for path
> MTU discovery.
> 
> OK to violate RFC for performance and simplicity?
> 

Still ok by me.

> bluhm
> 
> Index: netinet6/ip6_forward.c
> ===================================================================
> RCS file: /data/mirror/openbsd/cvs/src/sys/netinet6/ip6_forward.c,v
> diff -u -p -r1.121 ip6_forward.c
> --- netinet6/ip6_forward.c	9 Jul 2024 09:33:13 -0000	1.121
> +++ netinet6/ip6_forward.c	12 Jul 2024 09:27:23 -0000
> @@ -145,10 +145,33 @@ ip6_forward(struct mbuf *m, struct route
>  	 * Thanks to M_EXT, in most cases copy will not occur.
>  	 * For small packets copy original onto stack instead of mbuf.
>  	 *
> +	 * For final protocol header like TCP or UDP, full header chain in
> +	 * ICMP6 packet is not necessary.  In this case only copy small
> +	 * part of original packet and save it on stack instead of mbuf.
> +	 * Although this violates RFC 4443 2.4. (c), it avoids additional
> +	 * mbuf allocations.  Also pf nat and rdr do not affect the shared
> +	 * mbuf cluster.
> +	 *
>  	 * It is important to save it before IPsec processing as IPsec
>  	 * processing may modify the mbuf.
>  	 */
> -	icmp_len = min(m->m_pkthdr.len, ICMPV6_PLD_MAXLEN);
> +	switch (ip6->ip6_nxt) {
> +	case IPPROTO_TCP:
> +		icmp_len = sizeof(struct ip6_hdr) + sizeof(struct tcphdr) +
> +		    MAX_TCPOPTLEN;
> +		break;
> +	case IPPROTO_UDP:
> +		icmp_len = sizeof(struct ip6_hdr) + sizeof(struct udphdr);
> +		break;
> +	case IPPROTO_ESP:
> +		icmp_len = sizeof(struct ip6_hdr) + 2 * sizeof(u_int32_t);
> +		break;
> +	default:
> +		icmp_len = ICMPV6_PLD_MAXLEN;
> +		break;
> +	}
> +	if (icmp_len > m->m_pkthdr.len)
> +		icmp_len = m->m_pkthdr.len;
>  	if (icmp_len <= sizeof(icmp_buf)) {
>  		mflags = m->m_flags;
>  		pfflags = m->m_pkthdr.pf.flags;
>