From: Kristof Provost <kp@FreeBSD.org>
Subject: pf af-to breaks traceroute
To: tech@openbsd.org
Date: Tue, 25 Feb 2025 10:40:27 +0100

Hi,

Following a FreeBSD bug report 
(https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=284944) I’ve been 
investigating issues with traceroute through a nat64 (i.e. af-to) setup.

This appears to also affect OpenBSD pf.

The bug is that when we’re handling an ICMP packet we look up the 
state for the embedded packet and then, while performing af-to, we copy 
the state’s source address to pd->nsaddr. This is incorrect, because the 
source of the outer packet (the host that generated the ICMP error) is 
not the same as the address in the embedded packet that triggered the 
ICMP error in the first place.

Specifically this line (although the same applies to other protocols as 
well):
https://github.com/openbsd/src/blob/924e3373aa78c8e139fb57fe722f0e7717258538/sys/net/pf.c#L5935
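
To illustrate what the translated source address should look like, here is a minimal standalone sketch (not pf code) of the RFC 6052 mapping for a /96 prefix. The helper name nat64_embed96() is hypothetical. With the 64:ff9b::/96 prefix, the traceroute target 1.1.1.1 maps to 64:ff9b::101:101, while an ICMP error sent by an intermediate hop should map to that hop's own IPv4 address embedded in the same prefix.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only: build an IPv6 address by
 * placing a 4-byte IPv4 address in the low 32 bits of a /96 NAT64 prefix
 * (the RFC 6052 layout for prefix length 96).
 */
static void
nat64_embed96(const uint8_t prefix[16], const uint8_t v4[4], uint8_t out[16])
{
	assert(prefix != NULL && v4 != NULL && out != NULL);

	/* The first 96 bits (12 bytes) come from the NAT64 prefix. */
	memcpy(out, prefix, 12);
	/* The last 32 bits (4 bytes) are the IPv4 address itself. */
	memcpy(out + 12, v4, 4);
}
```

Feeding this helper the outer packet's IPv4 source rather than the state's address is the behaviour the fix is after: each hop then gets a distinct translated address.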

The easiest way to reproduce this is to run a traceroute through a pf 
af-to setup. This produces results like this:

	# traceroute6 64:ff9b::1.1.1.1
	traceroute6 to 64:ff9b::1.1.1.1 (64:ff9b::101:101) from 2a00:1098:6b:200::1, 64 hops max, 28 byte packets
	 1  uk-myb-1.le-fay.org (2a00:1098:6b:100::1)  0.544 ms  0.411 ms  0.305 ms
	 2  uk-aai-1.le-fay.org (2001:8b0:aab5:100::1)  6.738 ms  7.558 ms  7.520 ms
	 3  64:ff9b::101:101 (64:ff9b::101:101)  12.666 ms  12.443 ms  11.981 ms
	 4  64:ff9b::101:101 (64:ff9b::101:101)  12.904 ms  11.460 ms  13.006 ms
	 5  64:ff9b::101:101 (64:ff9b::101:101)  14.095 ms  13.377 ms  13.012 ms
	 6  64:ff9b::101:101 (64:ff9b::101:101)  12.984 ms  13.523 ms  14.175 ms
	 7  64:ff9b::101:101 (64:ff9b::101:101)  13.939 ms  13.436 ms  13.025 ms

I’m currently testing a patch along these lines (for FreeBSD):

	diff --git a/sys/netpfil/pf/pf.c b/sys/netpfil/pf/pf.c
	index f3c9ea7a2fb1..ac4bab45ffda 100644
	--- a/sys/netpfil/pf/pf.c
	+++ b/sys/netpfil/pf/pf.c
	@@ -8109,8 +8109,18 @@ pf_test_state_icmp(struct pf_kstate **state, struct pf_pdesc *pd,
	                                            nk->port[didx], 1, pd->af, nk->af);
	                                        m_copyback(pd2.m, pd2.off, sizeof(uh),
	                                            (c_caddr_t)&uh);
	-                                       PF_ACPY(&pd->nsaddr,
	-                                           &nk->addr[pd2.sidx], nk->af);
	+                                       if (pd->af == AF_INET) {
	+                                               struct pf_addr prefix, nsaddr;
	+                                               int prefixlen = in6_mask2len(
	+                                                   (struct in6_addr *)&(*state)->rule->dst.addr.v.a.mask, NULL);
	+                                               if (prefixlen < 32)
	+                                                       prefixlen = 96;
	+                                               PF_ACPY(&prefix, &nk->addr[pd2.sidx], nk->af);
	+                                               PF_ACPY(&nsaddr, pd->src, pd->af);
	+                                               inet_nat64(AF_INET6, pd->src, &nsaddr, &prefix,
	+                                                   prefixlen);
	+                                               PF_ACPY(&pd->nsaddr, &nsaddr, AF_INET6);
	+                                       }
	                                        PF_ACPY(&pd->ndaddr,
	                                            &nk->addr[pd2.didx], nk->af);
	                                        pd->naf = nk->af;

(That’s a very rough first draft, but it at least shows the intent. Among 
other things it must also be applied to the TCP/UDP/SCTP handling code 
in pf_test_state_icmp().)
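
For readers unfamiliar with the prefix-length step in the draft, the following standalone sketch shows the idea: count the leading one-bits of the rule's IPv6 mask and fall back to /96 when the result is shorter than 32 bits (RFC 6052 defines no NAT64 prefix shorter than /32). Both function names are hypothetical, and unlike the kernel's in6_mask2len() this simplified version does not check that the mask is contiguous.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: count the leading one-bits
 * in a 128-bit mask. Bits after the first zero bit are ignored, so a
 * non-contiguous mask is not detected here.
 */
static int
mask_prefixlen(const uint8_t mask[16])
{
	int len = 0;

	assert(mask != NULL);
	for (int i = 0; i < 16; i++) {
		uint8_t b = mask[i];

		if (b == 0xff) {
			len += 8;
			continue;
		}
		/* Partial byte: count its leading one-bits, then stop. */
		while (b & 0x80) {
			len++;
			b <<= 1;
		}
		break;
	}
	return (len);
}

/* Mirror the draft's fallback: anything shorter than /32 becomes /96. */
static int
nat64_prefixlen(const uint8_t mask[16])
{
	int len = mask_prefixlen(mask);

	return (len < 32 ? 96 : len);
}
```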

Best regards,
Kristof