Download raw body.
route reject/blackhole semantics
On Mon, Jul 29, 2024 at 05:15:56PM +0200, Alexander Bluhm wrote:
> On Mon, Jul 29, 2024 at 09:59:22AM +1000, David Gwynne wrote:
> > every few years i am surprised to learn that the RTF_REJECT and
> > RTF_BLACKHOLE flags on routes are only really checked when going
> > "out" on loopback interfaces. my expectation is that these route
> > checks happen when the route lookup happens. it's especially
> > surprising that routes with reject/blackhole flags can make it into
> > pf.
>
> The lookup happens way before as an optmization. As we need to
> know whether an address is local, we do route lookup in IP input.
> But the reject info is needed when we decide whether we should use
> the route for output.
It is still crazy that the reject or blackhole happens inside lo(4) that
is later than necessary.
> > this diff moves the reject/blackhole route checks into ip_output before
> > pf_test is called.
>
> But this prevents that pf route-to can override a route. Does my
> pf_forward regress test still pass with this change? My understanding
> of route-to is that it can work independently of IP output routing
I think there should be a limit in what pf can screw around with. As an
example if you have no default route and no route to 10.0.0.1. Then
ip_forward will not pass the packet to pf_test in ip_output.
If you have a 10/8 reject or blackhole route suddenly 10.0.0.1 will show
up in pf_test in ip_output. Which is in my opinion very surprising and
wrong.
If you route to on an in rule it still works and it does in both cases.
> > i am not alone in my confusion. claudio@ seems keen for this change, but
> > also because he thinks it can help simplify handling of blackhole/reject
> > routing in things like bgpd, but mostly that it is confusing.
>
> BSD4.4 seems to use RTF_REJECT only for routes without valid ARP
> entry. Who know what he semantics of reject routes should be today.
If you run a router in the default free zone you need either some reject
or blackhole routes. I normaly use blackhole sind I see no point in
sending back ICMP errors to not existing net blocks.
> > because the stack checks those flags, lo(4) output doesnt have to
> > anymore. however, if you do want to "null route" a prefix so it usually
> > wont end up on the wire, but give pf a chance to fiddle with it, you can
> > still add a route via lo(4) for it. which leads to a bug fix.
>
> Is it still possible to divert-to an address that was set with
> SO_BINDANY? They do not have a RTF_LOCAL route. There are a lot
> of corner cases in this area that should be considered.
Guess we need to look into those.
> > you can add routes like this (even without the reject/blackhole diffs):
> >
> > # route add 100.64.0.0/24 127.0.0.1
> >
> > if you then ping 100.64.0.1, you'll end up producing a traffic loop
> > inside the kernel. each ping packet will loop through lo0 until the ttl
> > expires. i think lo ouput should check that the address it's delivering
> > to has RTF_LOCAL set, which is what the if_loop.c chunk of the diff
> > does.
> >
> > thoughts? tests? ok?
>
> Why did you not change ip6_forward()? Now it behaves differently
> than ip_forward() and ip6_output(). It is a typical pitfall that
> IPv4 forward calls IP output, but IPv6 forward calls interface
> output.
>
> To me this looks very risky. I assume that I will see fallout when
> our customers will install our product based on this diff in more
> than a year. Usually I write regress tests to be aware of changes
> and keep behavior.
>
> What are the benefits of the change? Apart that the behavior is
> differnt and that it confuses you less?
We do not fully process a packet to drop it on the floor. Which is a big
deal when you blackhole a route to sink an attack.
I also think the behaviour is a lot better and will allow to make pf
route-to and reply-to to behave less crazy. There is no more needs to
fixup ICMP packets from 127.0.0.1. I think there are numerous cases that
improve the situation and thing that break currently work mostly by luck.
> bluhm
>
> > Index: net/if_loop.c
> > ===================================================================
> > RCS file: /cvs/src/sys/net/if_loop.c,v
> > diff -u -p -r1.98 if_loop.c
> > --- net/if_loop.c 29 Dec 2023 11:43:04 -0000 1.98
> > +++ net/if_loop.c 28 Jul 2024 11:58:10 -0000
> > @@ -261,10 +261,10 @@ looutput(struct ifnet *ifp, struct mbuf
> > if ((m->m_flags & M_PKTHDR) == 0)
> > panic("%s: no header mbuf", __func__);
> >
> > - if (rt && rt->rt_flags & (RTF_REJECT|RTF_BLACKHOLE)) {
> > + if (rt != NULL && !ISSET(rt->rt_flags, RTF_LOCAL)) {
> > m_freem(m);
> > - return (rt->rt_flags & RTF_BLACKHOLE ? 0 :
> > - rt->rt_flags & RTF_HOST ? EHOSTUNREACH : ENETUNREACH);
> > + return (ISSET(rt->rt_flags, RTF_HOST) ?
> > + EHOSTUNREACH : ENETUNREACH);
> > }
> >
> > /*
> > Index: netinet/ip_output.c
> > ===================================================================
> > RCS file: /cvs/src/sys/netinet/ip_output.c,v
> > diff -u -p -r1.401 ip_output.c
> > --- netinet/ip_output.c 2 Jul 2024 18:33:47 -0000 1.401
> > +++ netinet/ip_output.c 28 Jul 2024 11:58:10 -0000
> > @@ -398,6 +398,20 @@ sendit:
> > }
> > #endif /* IPSEC */
> >
> > + if (ro && ro->ro_rt) {
> > + struct rtentry *rt = ro->ro_rt;
> > +
> > + if (ISSET(rt->rt_flags, RTF_REJECT)) {
> > + error = ISSET(rt->rt_flags, RTF_HOST) ?
> > + EHOSTUNREACH : ENETUNREACH;
> > + goto bad;
> > + }
> > + if (ISSET(rt->rt_flags, RTF_BLACKHOLE)) {
> > + error = 0;
> > + goto bad;
> > + }
> > + }
> > +
> > /*
> > * Packet filter
> > */
> > Index: netinet6/ip6_output.c
> > ===================================================================
> > RCS file: /cvs/src/sys/netinet6/ip6_output.c,v
> > diff -u -p -r1.292 ip6_output.c
> > --- netinet6/ip6_output.c 4 Jul 2024 12:50:08 -0000 1.292
> > +++ netinet6/ip6_output.c 28 Jul 2024 11:58:10 -0000
> > @@ -483,9 +483,20 @@ reroute:
> > route6_cache(ro, &ip6->ip6_dst, NULL, m->m_pkthdr.ph_rtableid);
> > }
> >
> > - if (rt && (rt->rt_flags & RTF_GATEWAY) &&
> > - !IN6_IS_ADDR_MULTICAST(&ip6->ip6_dst))
> > - dst = satosin6(rt->rt_gateway);
> > + if (rt != NULL) {
> > + if (ISSET(rt->rt_flags, RTF_REJECT)) {
> > + error = ISSET(rt->rt_flags, RTF_HOST) ?
> > + EHOSTUNREACH : ENETUNREACH;
> > + goto bad;
> > + }
> > + if (ISSET(rt->rt_flags, RTF_BLACKHOLE)) {
> > + error = 0;
> > + goto bad;
> > + }
> > + if (ISSET(rt->rt_flags, RTF_GATEWAY) &&
> > + !IN6_IS_ADDR_MULTICAST(&ip6->ip6_dst))
> > + dst = satosin6(rt->rt_gateway);
> > + }
> >
> > if (!IN6_IS_ADDR_MULTICAST(&ip6->ip6_dst)) {
> > /* Unicast */
>
--
:wq Claudio
route reject/blackhole semantics