From: Kristof Provost
Subject: pf af-to breaks traceroute
To: tech@openbsd.org
Date: Tue, 25 Feb 2025 10:40:27 +0100

Hi,

Following a FreeBSD bug report
(https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=284944) I’ve been
investigating issues with traceroute through a nat64 (i.e. af-to) setup.
This appears to also affect OpenBSD pf.

The bug is that when we’re handling an ICMP error packet we look up the
state for the embedded packet, and then while performing af-to we copy the
state’s source address to pd->nsaddr. This is incorrect, because the source
of the outer packet is not the same as the address of the embedded packet
that generated the ICMP error in the first place.

Specifically this line (although the same applies to other protocols as
well):
https://github.com/openbsd/src/blob/924e3373aa78c8e139fb57fe722f0e7717258538/sys/net/pf.c#L5935

The easiest way to reproduce this is to run a traceroute through a pf af-to
setup. This produces results like this, where every hop beyond the
translator is reported with the destination address:

# traceroute6 64:ff9b::1.1.1.1
traceroute6 to 64:ff9b::1.1.1.1 (64:ff9b::101:101) from 2a00:1098:6b:200::1, 64 hops max, 28 byte packets
 1  uk-myb-1.le-fay.org (2a00:1098:6b:100::1)  0.544 ms  0.411 ms  0.305 ms
 2  uk-aai-1.le-fay.org (2001:8b0:aab5:100::1)  6.738 ms  7.558 ms  7.520 ms
 3  64:ff9b::101:101 (64:ff9b::101:101)  12.666 ms  12.443 ms  11.981 ms
 4  64:ff9b::101:101 (64:ff9b::101:101)  12.904 ms  11.460 ms  13.006 ms
 5  64:ff9b::101:101 (64:ff9b::101:101)  14.095 ms  13.377 ms  13.012 ms
 6  64:ff9b::101:101 (64:ff9b::101:101)  12.984 ms  13.523 ms  14.175 ms
 7  64:ff9b::101:101 (64:ff9b::101:101)  13.939 ms  13.436 ms  13.025 ms

I’m currently testing a patch along these lines (for FreeBSD):

diff --git a/sys/netpfil/pf/pf.c b/sys/netpfil/pf/pf.c
index f3c9ea7a2fb1..ac4bab45ffda 100644
--- a/sys/netpfil/pf/pf.c
+++ b/sys/netpfil/pf/pf.c
@@ -8109,8 +8109,18 @@ pf_test_state_icmp(struct pf_kstate **state, struct pf_pdesc *pd,
 			    nk->port[didx], 1, pd->af, nk->af);
 			m_copyback(pd2.m, pd2.off, sizeof(uh),
 			    (c_caddr_t)&uh);
-			PF_ACPY(&pd->nsaddr,
-			    &nk->addr[pd2.sidx], nk->af);
+			if (pd->af == AF_INET) {
+				struct pf_addr prefix, nsaddr;
+				int prefixlen = in6_mask2len(
+				    (struct in6_addr *)&(*state)->rule->dst.addr.v.a.mask, NULL);
+				if (prefixlen < 32)
+					prefixlen = 96;
+				PF_ACPY(&prefix, &nk->addr[pd2.sidx], nk->af);
+				PF_ACPY(&nsaddr, pd->src, pd->af);
+				inet_nat64(AF_INET6, pd->src, &nsaddr, &prefix,
+				    prefixlen);
+				PF_ACPY(&pd->nsaddr, &nsaddr, AF_INET6);
+			}
 			PF_ACPY(&pd->ndaddr,
 			    &nk->addr[pd2.didx], nk->af);
 			pd->naf = nk->af;

(That’s a very rough first draft, but it at least shows the intent. Among
other things it must also be applied to the TCP/UDP/SCTP handling code in
pf_test_state_icmp().)

Best regards,
Kristof
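
P.S. For anyone not familiar with the mapping the patch relies on, here is a
standalone userland sketch of embedding an IPv4 address into a /96 NAT64
prefix, in the style of RFC 6052. It is only an illustration, not pf code:
nat64_embed96() and nat64_map_str() are hypothetical names, and it assumes a
/96 prefix only, whereas inet_nat64() in the patch takes a prefix length.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Hypothetical helper (not a pf function): map an IPv4 address into a /96
 * NAT64 prefix by copying it into the low 32 bits of the IPv6 address.
 * Assumes the prefix is exactly 96 bits long.
 */
void
nat64_embed96(const struct in6_addr *prefix, const struct in_addr *v4,
    struct in6_addr *out)
{
	memcpy(out, prefix, sizeof(*out));
	/* s_addr is already in network byte order, so copy it verbatim. */
	memcpy(&out->s6_addr[12], &v4->s_addr, sizeof(v4->s_addr));
}

/*
 * Hypothetical convenience wrapper: parse both addresses from text, map,
 * and format the result into buf. Returns 0 on success, -1 on error.
 */
int
nat64_map_str(const char *prefix, const char *v4, char *buf, socklen_t len)
{
	struct in6_addr p, out;
	struct in_addr a;

	if (inet_pton(AF_INET6, prefix, &p) != 1 ||
	    inet_pton(AF_INET, v4, &a) != 1)
		return (-1);
	nat64_embed96(&p, &a, &out);
	return (inet_ntop(AF_INET6, &out, buf, len) == NULL ? -1 : 0);
}
```

Calling nat64_map_str("64:ff9b::", "1.1.1.1", buf, sizeof(buf)) yields the
address traceroute6 shows for the destination above (64:ff9b::101:101; the
exact text form may vary between inet_ntop() implementations). The intent
of the patch is that the IPv4 source of each ICMP error, i.e. the
intermediate router, is mapped into the prefix the same way, instead of
reusing the address stored in the state.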