From: Alexander Bluhm Subject: Re: ip6 forward mcopy To: tech@openbsd.org Date: Fri, 12 Jul 2024 11:56:31 +0200 On Mon, Jun 24, 2024 at 11:33:44AM +0200, Alexander Bluhm wrote: > Forwarding IPv6 packets is slower than IPv4. Reason is the m_copym() > that is done for every packet. Just in case we may have to send > an ICMP6 packet, ip6_forward() creates a mbuf copy. After that > mbuf cluster is read only, so for the ethernet header another mbuf > is allocated. pf NAT and RDR ignores readonly clusters, so it also > modifies the potential ICMP6 packet. > > IPv4 ip_forward() avoids all these problems by copying the leading > 68 bytes of the original packets onto the stack. More is not need > for ICMP. IPv6 RFC requires up to 1232 bytes in the ICMP6 packet. > This cannot be copied to the stack. > > The reason for the difference is that the ICMP6 packet has to contain > the full header chain. I think if we have a simple UDP or TCP > packet without chain, we could do a shortcut and just copy the > header to ICMP6 packet. > > Do we want to violate the RFC and gain 30% performance for the > common TCP/UDP case? Part of the diff have been commited. For small packets we use stack memory, large packets need extra mbuf allocation. Following diff truncates ICMP6 packtes to a reasonable length if the orignal packets has a final protocol header directly after the IPv6 header. My list contains TCP, UDP, ESP as they cover the common cases and anything behind it should not be needed for path MTU discovery. OK to violate RFC for performance and simplicity? bluhm Index: netinet6/ip6_forward.c =================================================================== RCS file: /data/mirror/openbsd/cvs/src/sys/netinet6/ip6_forward.c,v diff -u -p -r1.121 ip6_forward.c --- netinet6/ip6_forward.c 9 Jul 2024 09:33:13 -0000 1.121 +++ netinet6/ip6_forward.c 12 Jul 2024 09:27:23 -0000 @@ -145,10 +145,33 @@ ip6_forward(struct mbuf *m, struct route * Thanks to M_EXT, in most cases copy will not occur. * For small packets copy original onto stack instead of mbuf. * + * For final protocol header like TCP or UDP, full header chain in + * ICMP6 packet is not necessary. In this case only copy small + * part of original packet and save it on stack instead of mbuf. + * Although this violates RFC 4443 2.4. (c), it avoids additional + * mbuf allocations. Also pf nat and rdr do not affect the shared + * mbuf cluster. + * * It is important to save it before IPsec processing as IPsec * processing may modify the mbuf. */ - icmp_len = min(m->m_pkthdr.len, ICMPV6_PLD_MAXLEN); + switch (ip6->ip6_nxt) { + case IPPROTO_TCP: + icmp_len = sizeof(struct ip6_hdr) + sizeof(struct tcphdr) + + MAX_TCPOPTLEN; + break; + case IPPROTO_UDP: + icmp_len = sizeof(struct ip6_hdr) + sizeof(struct udphdr); + break; + case IPPROTO_ESP: + icmp_len = sizeof(struct ip6_hdr) + 2 * sizeof(u_int32_t); + break; + default: + icmp_len = ICMPV6_PLD_MAXLEN; + break; + } + if (icmp_len > m->m_pkthdr.len) + icmp_len = m->m_pkthdr.len; if (icmp_len <= sizeof(icmp_buf)) { mflags = m->m_flags; pfflags = m->m_pkthdr.pf.flags;