From: Vitaliy Makkoveev Subject: Re: ip6 forward mcopy To: Alexander Bluhm Cc: tech@openbsd.org Date: Fri, 12 Jul 2024 14:46:35 +0300 On Fri, Jul 12, 2024 at 11:56:31AM +0200, Alexander Bluhm wrote: > On Mon, Jun 24, 2024 at 11:33:44AM +0200, Alexander Bluhm wrote: > > Forwarding IPv6 packets is slower than IPv4. Reason is the m_copym() > > that is done for every packet. Just in case we may have to send > > an ICMP6 packet, ip6_forward() creates a mbuf copy. After that > > mbuf cluster is read only, so for the ethernet header another mbuf > > is allocated. pf NAT and RDR ignores readonly clusters, so it also > > modifies the potential ICMP6 packet. > > > > IPv4 ip_forward() avoids all these problems by copying the leading > > 68 bytes of the original packets onto the stack. More is not need > > for ICMP. IPv6 RFC requires up to 1232 bytes in the ICMP6 packet. > > This cannot be copied to the stack. > > > > The reason for the difference is that the ICMP6 packet has to contain > > the full header chain. I think if we have a simple UDP or TCP > > packet without chain, we could do a shortcut and just copy the > > header to ICMP6 packet. > > > > Do we want to violate the RFC and gain 30% performance for the > > common TCP/UDP case? > > Part of the diff have been commited. For small packets we use stack > memory, large packets need extra mbuf allocation. > > Following diff truncates ICMP6 packtes to a reasonable length if > the orignal packets has a final protocol header directly after the > IPv6 header. My list contains TCP, UDP, ESP as they cover the > common cases and anything behind it should not be needed for path > MTU discovery. > > OK to violate RFC for performance and simplicity? > Still ok by me. > bluhm > > Index: netinet6/ip6_forward.c > =================================================================== > RCS file: /data/mirror/openbsd/cvs/src/sys/netinet6/ip6_forward.c,v > diff -u -p -r1.121 ip6_forward.c > --- netinet6/ip6_forward.c 9 Jul 2024 09:33:13 -0000 1.121 > +++ netinet6/ip6_forward.c 12 Jul 2024 09:27:23 -0000 > @@ -145,10 +145,33 @@ ip6_forward(struct mbuf *m, struct route > * Thanks to M_EXT, in most cases copy will not occur. > * For small packets copy original onto stack instead of mbuf. > * > + * For final protocol header like TCP or UDP, full header chain in > + * ICMP6 packet is not necessary. In this case only copy small > + * part of original packet and save it on stack instead of mbuf. > + * Although this violates RFC 4443 2.4. (c), it avoids additional > + * mbuf allocations. Also pf nat and rdr do not affect the shared > + * mbuf cluster. > + * > * It is important to save it before IPsec processing as IPsec > * processing may modify the mbuf. > */ > - icmp_len = min(m->m_pkthdr.len, ICMPV6_PLD_MAXLEN); > + switch (ip6->ip6_nxt) { > + case IPPROTO_TCP: > + icmp_len = sizeof(struct ip6_hdr) + sizeof(struct tcphdr) + > + MAX_TCPOPTLEN; > + break; > + case IPPROTO_UDP: > + icmp_len = sizeof(struct ip6_hdr) + sizeof(struct udphdr); > + break; > + case IPPROTO_ESP: > + icmp_len = sizeof(struct ip6_hdr) + 2 * sizeof(u_int32_t); > + break; > + default: > + icmp_len = ICMPV6_PLD_MAXLEN; > + break; > + } > + if (icmp_len > m->m_pkthdr.len) > + icmp_len = m->m_pkthdr.len; > if (icmp_len <= sizeof(icmp_buf)) { > mflags = m->m_flags; > pfflags = m->m_pkthdr.pf.flags; >