Download raw body.
ipv4 icmp_reflect() source address selection optimisation
On Thu, Jul 25, 2024 at 02:55:39PM +1000, David Gwynne wrote:
> this changes icmp_reflect() so it doesn't do work that ip_output is
> going to do for it anyway.
>
> if the original packet we need to send some icmp about was for us,
> we can send the icmp packet back to the sender from our local address.
> however, if we're a router we need to pick one of our own ips to send
> that reply from. currently icmp_reflect fills in the icmp packets
> src ip by doing a route lookup for the new dst address and using
> the local ip from the route entry.
>
> however, picking a local source address is not an icmp specific
> problem. other parts of the stack solve this problem by leaving the
> src ip unset (0.0.0.0) and ip_forward fixes it up using the address
> on the route entry made against the dst ip.
>
> ip_output has to do the route lookup anyway, so save a bit of effort by
> not doing it in icmp_reflect.
>
> this also has the benefit that there's one less place implementing
> this semantic, so there's less to fix up if we ever want to tweak
> source address selection based on route lookups.
>
> ok?
This change breaks regress/sys/netinet/ipsec.
But not IPsec itself is broken, just the behavior changes. I had
to add an explicit flow for ICMP to test path MTU. Now the source
address of the ICMP packet is different and the packet does not
match anymore.
Of course it would be easy to adapt the test. But is the new
behavior better? You code selects the first address of the interface
where the original packet arrived. Before the address associated
with the route was used. In IPv4 we do source address selection
based on routing table. Using the local IP address of the route
the ICMP packet goes to or the original packet came from makes more
sense to me than to select the first address of the interface.
bluhm
> Index: ip_icmp.c
> ===================================================================
> RCS file: /cvs/src/sys/netinet/ip_icmp.c,v
> diff -u -p -r1.196 ip_icmp.c
> --- ip_icmp.c 14 Jul 2024 18:53:39 -0000 1.196
> +++ ip_icmp.c 25 Jul 2024 04:53:21 -0000
> @@ -684,7 +684,7 @@ icmp_reflect(struct mbuf *m, struct mbuf
> struct ip *ip = mtod(m, struct ip *);
> struct mbuf *opts = NULL;
> struct sockaddr_in sin;
> - struct rtentry *rt = NULL;
> + struct in_addr ip_src = { INADDR_ANY };
> int optlen = (ip->ip_hl << 2) - sizeof(struct ip);
> u_int rtableid;
> u_int8_t pfflags;
> @@ -710,49 +710,33 @@ icmp_reflect(struct mbuf *m, struct mbuf
> * If the incoming packet was addressed directly to us,
> * use dst as the src for the reply. For broadcast, use
> * the address which corresponds to the incoming interface.
> + * ip_output() will fix up ip_src if it's still 0.0.0.0
> + * after this.
> */
> if (ia == NULL) {
> + struct rtentry *rt;
> +
> memset(&sin, 0, sizeof(sin));
> sin.sin_len = sizeof(sin);
> sin.sin_family = AF_INET;
> sin.sin_addr = ip->ip_dst;
>
> rt = rtalloc(sintosa(&sin), 0, rtableid);
> - if (rtisvalid(rt) &&
> - ISSET(rt->rt_flags, RTF_LOCAL|RTF_BROADCAST))
> - ia = ifatoia(rt->rt_ifa);
> - }
> -
> - /*
> - * The following happens if the packet was not addressed to us.
> - * Use the new source address and do a route lookup. If it fails
> - * drop the packet as there is no path to the host.
> - */
> - if (ia == NULL) {
> - rtfree(rt);
> -
> - memset(&sin, 0, sizeof(sin));
> - sin.sin_len = sizeof(sin);
> - sin.sin_family = AF_INET;
> - sin.sin_addr = ip->ip_src;
> -
> - /* keep packet in the original virtual instance */
> - rt = rtalloc(sintosa(&sin), RT_RESOLVE, rtableid);
> - if (rt == NULL) {
> - ipstat_inc(ips_noroute);
> - m_freem(m);
> - return (EHOSTUNREACH);
> + if (rtisvalid(rt)) {
> + if (ISSET(rt->rt_flags, RTF_LOCAL))
> + ip_src = ip->ip_dst;
> + else if (ISSET(rt->rt_flags, RTF_BROADCAST)) {
> + ia = ifatoia(rt->rt_ifa);
> + ip_src = ia->ia_addr.sin_addr;
> + }
> }
> -
> - ia = ifatoia(rt->rt_ifa);
> - }
> + rtfree(rt);
> + } else
> + ip_src = ia->ia_addr.sin_addr;
>
> ip->ip_dst = ip->ip_src;
> + ip->ip_src = ip_src;
> ip->ip_ttl = MAXTTL;
> -
> - /* It is safe to dereference ``ia'' iff ``rt'' is valid. */
> - ip->ip_src = ia->ia_addr.sin_addr;
> - rtfree(rt);
>
> if (optlen > 0) {
> u_char *cp;
ipv4 icmp_reflect() source address selection optimisation