From: Claudio Jeker Subject: Re: route reject/blackhole semantics To: Alexander Bluhm Cc: David Gwynne , tech@openbsd.org Date: Tue, 30 Jul 2024 08:36:11 +0200 On Mon, Jul 29, 2024 at 05:15:56PM +0200, Alexander Bluhm wrote: > On Mon, Jul 29, 2024 at 09:59:22AM +1000, David Gwynne wrote: > > every few years i am surprised to learn that the RTF_REJECT and > > RTF_BLACKHOLE flags on routes are only really checked when going > > "out" on loopback interfaces. my expectation is that these route > > checks happen when the route lookup happens. it's especially > > surprising that routes with reject/blackhole flags can make it into > > pf. > > The lookup happens way before as an optmization. As we need to > know whether an address is local, we do route lookup in IP input. > But the reject info is needed when we decide whether we should use > the route for output. It is still crazy that the reject or blackhole happens inside lo(4) that is later than necessary. > > this diff moves the reject/blackhole route checks into ip_output before > > pf_test is called. > > But this prevents that pf route-to can override a route. Does my > pf_forward regress test still pass with this change? My understanding > of route-to is that it can work independently of IP output routing I think there should be a limit in what pf can screw around with. As an example if you have no default route and no route to 10.0.0.1. Then ip_forward will not pass the packet to pf_test in ip_output. If you have a 10/8 reject or blackhole route suddenly 10.0.0.1 will show up in pf_test in ip_output. Which is in my opinion very surprising and wrong. If you route to on an in rule it still works and it does in both cases. > > i am not alone in my confusion. claudio@ seems keen for this change, but > > also because he thinks it can help simplify handling of blackhole/reject > > routing in things like bgpd, but mostly that it is confusing. > > BSD4.4 seems to use RTF_REJECT only for routes without valid ARP > entry. Who know what he semantics of reject routes should be today. If you run a router in the default free zone you need either some reject or blackhole routes. I normaly use blackhole sind I see no point in sending back ICMP errors to not existing net blocks. > > because the stack checks those flags, lo(4) output doesnt have to > > anymore. however, if you do want to "null route" a prefix so it usually > > wont end up on the wire, but give pf a chance to fiddle with it, you can > > still add a route via lo(4) for it. which leads to a bug fix. > > Is it still possible to divert-to an address that was set with > SO_BINDANY? They do not have a RTF_LOCAL route. There are a lot > of corner cases in this area that should be considered. Guess we need to look into those. > > you can add routes like this (even without the reject/blackhole diffs): > > > > # route add 100.64.0.0/24 127.0.0.1 > > > > if you then ping 100.64.0.1, you'll end up producing a traffic loop > > inside the kernel. each ping packet will loop through lo0 until the ttl > > expires. i think lo ouput should check that the address it's delivering > > to has RTF_LOCAL set, which is what the if_loop.c chunk of the diff > > does. > > > > thoughts? tests? ok? > > Why did you not change ip6_forward()? Now it behaves differently > than ip_forward() and ip6_output(). It is a typical pitfall that > IPv4 forward calls IP output, but IPv6 forward calls interface > output. > > To me this looks very risky. I assume that I will see fallout when > our customers will install our product based on this diff in more > than a year. Usually I write regress tests to be aware of changes > and keep behavior. > > What are the benefits of the change? Apart that the behavior is > differnt and that it confuses you less? We do not fully process a packet to drop it on the floor. Which is a big deal when you blackhole a route to sink an attack. I also think the behaviour is a lot better and will allow to make pf route-to and reply-to to behave less crazy. There is no more needs to fixup ICMP packets from 127.0.0.1. I think there are numerous cases that improve the situation and thing that break currently work mostly by luck. > bluhm > > > Index: net/if_loop.c > > =================================================================== > > RCS file: /cvs/src/sys/net/if_loop.c,v > > diff -u -p -r1.98 if_loop.c > > --- net/if_loop.c 29 Dec 2023 11:43:04 -0000 1.98 > > +++ net/if_loop.c 28 Jul 2024 11:58:10 -0000 > > @@ -261,10 +261,10 @@ looutput(struct ifnet *ifp, struct mbuf > > if ((m->m_flags & M_PKTHDR) == 0) > > panic("%s: no header mbuf", __func__); > > > > - if (rt && rt->rt_flags & (RTF_REJECT|RTF_BLACKHOLE)) { > > + if (rt != NULL && !ISSET(rt->rt_flags, RTF_LOCAL)) { > > m_freem(m); > > - return (rt->rt_flags & RTF_BLACKHOLE ? 0 : > > - rt->rt_flags & RTF_HOST ? EHOSTUNREACH : ENETUNREACH); > > + return (ISSET(rt->rt_flags, RTF_HOST) ? > > + EHOSTUNREACH : ENETUNREACH); > > } > > > > /* > > Index: netinet/ip_output.c > > =================================================================== > > RCS file: /cvs/src/sys/netinet/ip_output.c,v > > diff -u -p -r1.401 ip_output.c > > --- netinet/ip_output.c 2 Jul 2024 18:33:47 -0000 1.401 > > +++ netinet/ip_output.c 28 Jul 2024 11:58:10 -0000 > > @@ -398,6 +398,20 @@ sendit: > > } > > #endif /* IPSEC */ > > > > + if (ro && ro->ro_rt) { > > + struct rtentry *rt = ro->ro_rt; > > + > > + if (ISSET(rt->rt_flags, RTF_REJECT)) { > > + error = ISSET(rt->rt_flags, RTF_HOST) ? > > + EHOSTUNREACH : ENETUNREACH; > > + goto bad; > > + } > > + if (ISSET(rt->rt_flags, RTF_BLACKHOLE)) { > > + error = 0; > > + goto bad; > > + } > > + } > > + > > /* > > * Packet filter > > */ > > Index: netinet6/ip6_output.c > > =================================================================== > > RCS file: /cvs/src/sys/netinet6/ip6_output.c,v > > diff -u -p -r1.292 ip6_output.c > > --- netinet6/ip6_output.c 4 Jul 2024 12:50:08 -0000 1.292 > > +++ netinet6/ip6_output.c 28 Jul 2024 11:58:10 -0000 > > @@ -483,9 +483,20 @@ reroute: > > route6_cache(ro, &ip6->ip6_dst, NULL, m->m_pkthdr.ph_rtableid); > > } > > > > - if (rt && (rt->rt_flags & RTF_GATEWAY) && > > - !IN6_IS_ADDR_MULTICAST(&ip6->ip6_dst)) > > - dst = satosin6(rt->rt_gateway); > > + if (rt != NULL) { > > + if (ISSET(rt->rt_flags, RTF_REJECT)) { > > + error = ISSET(rt->rt_flags, RTF_HOST) ? > > + EHOSTUNREACH : ENETUNREACH; > > + goto bad; > > + } > > + if (ISSET(rt->rt_flags, RTF_BLACKHOLE)) { > > + error = 0; > > + goto bad; > > + } > > + if (ISSET(rt->rt_flags, RTF_GATEWAY) && > > + !IN6_IS_ADDR_MULTICAST(&ip6->ip6_dst)) > > + dst = satosin6(rt->rt_gateway); > > + } > > > > if (!IN6_IS_ADDR_MULTICAST(&ip6->ip6_dst)) { > > /* Unicast */ > -- :wq Claudio