From: Alexander Bluhm Subject: Re: route reject/blackhole semantics To: David Gwynne Cc: tech@openbsd.org Date: Mon, 29 Jul 2024 17:15:56 +0200 On Mon, Jul 29, 2024 at 09:59:22AM +1000, David Gwynne wrote: > every few years i am surprised to learn that the RTF_REJECT and > RTF_BLACKHOLE flags on routes are only really checked when going > "out" on loopback interfaces. my expectation is that these route > checks happen when the route lookup happens. it's especially > surprising that routes with reject/blackhole flags can make it into > pf. The lookup happens way before as an optmization. As we need to know whether an address is local, we do route lookup in IP input. But the reject info is needed when we decide whether we should use the route for output. > this diff moves the reject/blackhole route checks into ip_output before > pf_test is called. But this prevents that pf route-to can override a route. Does my pf_forward regress test still pass with this change? My understanding of route-to is that it can work independently of IP output routing > i am not alone in my confusion. claudio@ seems keen for this change, but > also because he thinks it can help simplify handling of blackhole/reject > routing in things like bgpd, but mostly that it is confusing. BSD4.4 seems to use RTF_REJECT only for routes without valid ARP entry. Who know what he semantics of reject routes should be today. > because the stack checks those flags, lo(4) output doesnt have to > anymore. however, if you do want to "null route" a prefix so it usually > wont end up on the wire, but give pf a chance to fiddle with it, you can > still add a route via lo(4) for it. which leads to a bug fix. Is it still possible to divert-to an address that was set with SO_BINDANY? They do not have a RTF_LOCAL route. There are a lot of corner cases in this area that should be considered. > you can add routes like this (even without the reject/blackhole diffs): > > # route add 100.64.0.0/24 127.0.0.1 > > if you then ping 100.64.0.1, you'll end up producing a traffic loop > inside the kernel. each ping packet will loop through lo0 until the ttl > expires. i think lo ouput should check that the address it's delivering > to has RTF_LOCAL set, which is what the if_loop.c chunk of the diff > does. > > thoughts? tests? ok? Why did you not change ip6_forward()? Now it behaves differently than ip_forward() and ip6_output(). It is a typical pitfall that IPv4 forward calls IP output, but IPv6 forward calls interface output. To me this looks very risky. I assume that I will see fallout when our customers will install our product based on this diff in more than a year. Usually I write regress tests to be aware of changes and keep behavior. What are the benefits of the change? Apart that the behavior is differnt and that it confuses you less? bluhm > Index: net/if_loop.c > =================================================================== > RCS file: /cvs/src/sys/net/if_loop.c,v > diff -u -p -r1.98 if_loop.c > --- net/if_loop.c 29 Dec 2023 11:43:04 -0000 1.98 > +++ net/if_loop.c 28 Jul 2024 11:58:10 -0000 > @@ -261,10 +261,10 @@ looutput(struct ifnet *ifp, struct mbuf > if ((m->m_flags & M_PKTHDR) == 0) > panic("%s: no header mbuf", __func__); > > - if (rt && rt->rt_flags & (RTF_REJECT|RTF_BLACKHOLE)) { > + if (rt != NULL && !ISSET(rt->rt_flags, RTF_LOCAL)) { > m_freem(m); > - return (rt->rt_flags & RTF_BLACKHOLE ? 0 : > - rt->rt_flags & RTF_HOST ? EHOSTUNREACH : ENETUNREACH); > + return (ISSET(rt->rt_flags, RTF_HOST) ? > + EHOSTUNREACH : ENETUNREACH); > } > > /* > Index: netinet/ip_output.c > =================================================================== > RCS file: /cvs/src/sys/netinet/ip_output.c,v > diff -u -p -r1.401 ip_output.c > --- netinet/ip_output.c 2 Jul 2024 18:33:47 -0000 1.401 > +++ netinet/ip_output.c 28 Jul 2024 11:58:10 -0000 > @@ -398,6 +398,20 @@ sendit: > } > #endif /* IPSEC */ > > + if (ro && ro->ro_rt) { > + struct rtentry *rt = ro->ro_rt; > + > + if (ISSET(rt->rt_flags, RTF_REJECT)) { > + error = ISSET(rt->rt_flags, RTF_HOST) ? > + EHOSTUNREACH : ENETUNREACH; > + goto bad; > + } > + if (ISSET(rt->rt_flags, RTF_BLACKHOLE)) { > + error = 0; > + goto bad; > + } > + } > + > /* > * Packet filter > */ > Index: netinet6/ip6_output.c > =================================================================== > RCS file: /cvs/src/sys/netinet6/ip6_output.c,v > diff -u -p -r1.292 ip6_output.c > --- netinet6/ip6_output.c 4 Jul 2024 12:50:08 -0000 1.292 > +++ netinet6/ip6_output.c 28 Jul 2024 11:58:10 -0000 > @@ -483,9 +483,20 @@ reroute: > route6_cache(ro, &ip6->ip6_dst, NULL, m->m_pkthdr.ph_rtableid); > } > > - if (rt && (rt->rt_flags & RTF_GATEWAY) && > - !IN6_IS_ADDR_MULTICAST(&ip6->ip6_dst)) > - dst = satosin6(rt->rt_gateway); > + if (rt != NULL) { > + if (ISSET(rt->rt_flags, RTF_REJECT)) { > + error = ISSET(rt->rt_flags, RTF_HOST) ? > + EHOSTUNREACH : ENETUNREACH; > + goto bad; > + } > + if (ISSET(rt->rt_flags, RTF_BLACKHOLE)) { > + error = 0; > + goto bad; > + } > + if (ISSET(rt->rt_flags, RTF_GATEWAY) && > + !IN6_IS_ADDR_MULTICAST(&ip6->ip6_dst)) > + dst = satosin6(rt->rt_gateway); > + } > > if (!IN6_IS_ADDR_MULTICAST(&ip6->ip6_dst)) { > /* Unicast */