Index | Thread | Search

From:
Alexander Bluhm <alexander.bluhm@gmx.net>
Subject:
Re: route cache mpath
To:
tech@openbsd.org
Date:
Mon, 26 Feb 2024 20:32:31 +0100

Download raw body.

Thread
On Mon, Feb 26, 2024 at 05:22:50PM +0100, Claudio Jeker wrote:
> On Mon, Feb 26, 2024 at 03:51:37PM +0100, Alexander Bluhm wrote:
> > On Mon, Feb 26, 2024 at 11:56:08AM +0100, Claudio Jeker wrote:
> > > On Mon, Feb 26, 2024 at 11:44:57AM +0100, Alexander Bluhm wrote:
> > > > Hi,
> > > > 
> > > > We can combine route_cache() and rtalloc_mpath() in a new route_mpath()
> > > > function.  Then the caller does not have to care about the uint32_t
> > > > *src pointer and just pass struct in_addr.  All the conversions are
> > > > done inside the functions.  ro->ro_rt is either valid or NULL.
> > > > 
> > > > Also IP input should pass a struct route to IP forward.  This is
> > > > the same logic that is done when passing a route from IP forward
> > > > to IP output.  As a result the numbers of route cache lookups in
> > > > netstat -s should be correct now.
> > > > 
> > > > Finally I removed some inconsistencies between IPv4 and IPv4 and
> > > > IP forward and IP output.
> > > > 
> > > > ok?
> > > 
> > > I would prefer if rtalloc_mpath() is fixed and just takes a struct route *
> > > as argument. This uint32_t * in rtalloc_mpath() and rtable_match() needs
> > > to be changed. Passing the source as a uint32_t * is bad in many regards.
> > 
> > Do you have a diff to explain what you want?
> > 
> > route6_mpath() takes struct in_addr and writes it into struct route.
> > What is bad about this design?
> 
> Nothing. I wonder if we need yet another wrapper around a wrapper so that
> another wrapper can wrap it into a wrapper before passing it the function
> doing the lookup.
> 
> If rtalloc_mpath() would be as simple as
>    struct rtentry *rtalloc_mpath(const struct route *);
> then a lot of this would just go away since it is not needed.
> Actually most of the code of route_mpath() and route6_mpath() could be
> collapsed into rtalloc_mpath() (or vice versa).

We can get rid of rtalloc_mpath() later.  There are only 3 callers
left.  pf route-to does not use struct route yet.  IP output uses
route cache without route lookup for multicast.  Looks wrong, has
to be investigated.

After that fixed, we can remove rtalloc_mpath() and call rt_match()
from route_mpath().  This will remove the wrapper.

> Btw. I think the check to set 's' in route6_mpath() is reversed.

Good find, the ! was missing.  As multipath with IPv6 is broken
anyway, test could not find it.

> > We might get rid of rtalloc_mpath().  But then we have to use route
> > cache everywhere.  Rewriting the uint32_t * mess in the routing
> > code is not on my agenda.  I try to improve things in small steps.
> > Moving part of rtalloc_mpath() calls to route6_mpath() makes it
> > better.  Are route6_mpath() parameter the arguments you wish?
> 
> > > I like the changes to ip_forward and ip6_forward and the general trend to
> > > make code more consistent.
> > 
> > Splitting the diff to change ip_forward() first is hard.  The API
> > of route6_mpath() or rtalloc_mpath() decides the way how consistent
> > code should look like.
> > 
> > How do we proceed?  I don't understand how the code you wish looks.
> 
> Not against moving on with this but ideally there will be a few more steps
> after this that slowly make rtalloc() and rtalloc_mpath() better functions.
> As I said I like the trajectory of this work since it does a lot of good
> cleanup at higher levels.
>  
> > > > Or should I split the diff in smaller pieces?

To make review easier I can split introduction of route_mpath()
into smaller diff.  This contains easy callers in in_pcb.

Note that I removed useless rt != NULL check in in6_selectif().

ok?

bluhm

Index: net/route.c
===================================================================
RCS file: /data/mirror/openbsd/cvs/src/sys/net/route.c,v
diff -u -p -r1.433 route.c
--- net/route.c	22 Feb 2024 14:25:58 -0000	1.433
+++ net/route.c	26 Feb 2024 18:19:42 -0000
@@ -239,6 +239,28 @@ route_cache(struct route *ro, const stru
 	return (ESRCH);
 }
 
+/*
+ * Check cache for route, else allocate a new one, potentially using multipath
+ * to select the peer.  Update cache and return valid route or NULL.
+ */
+struct rtentry *
+route_mpath(struct route *ro, const struct in_addr *dst,
+    const struct in_addr *src, u_int rtableid)
+{
+	if (route_cache(ro, dst, src, rtableid)) {
+		uint32_t *s = NULL;
+
+		if (ro->ro_srcin.s_addr != INADDR_ANY)
+			s = &ro->ro_srcin.s_addr;
+		ro->ro_rt = rtalloc_mpath(&ro->ro_dstsa, s, ro->ro_tableid);
+		if (!rtisvalid(ro->ro_rt)) {
+			rtfree(ro->ro_rt);
+			ro->ro_rt = NULL;
+		}
+	}
+	return (ro->ro_rt);
+}
+
 #ifdef INET6
 int
 route6_cache(struct route *ro, const struct in6_addr *dst,
@@ -276,6 +298,24 @@ route6_cache(struct route *ro, const str
 		ro->ro_srcin6 = *src;
 
 	return (ESRCH);
+}
+
+struct rtentry *
+route6_mpath(struct route *ro, const struct in6_addr *dst,
+    const struct in6_addr *src, u_int rtableid)
+{
+	if (route6_cache(ro, dst, src, rtableid)) {
+		uint32_t *s = NULL;
+
+		if (!IN6_IS_ADDR_UNSPECIFIED(&ro->ro_srcin6))
+			s = &ro->ro_srcin6.s6_addr32[0];
+		ro->ro_rt = rtalloc_mpath(&ro->ro_dstsa, s, ro->ro_tableid);
+		if (!rtisvalid(ro->ro_rt)) {
+			rtfree(ro->ro_rt);
+			ro->ro_rt = NULL;
+		}
+	}
+	return (ro->ro_rt);
 }
 #endif
 
Index: net/route.h
===================================================================
RCS file: /data/mirror/openbsd/cvs/src/sys/net/route.h,v
diff -u -p -r1.207 route.h
--- net/route.h	22 Feb 2024 14:25:58 -0000	1.207
+++ net/route.h	26 Feb 2024 18:19:42 -0000
@@ -465,7 +465,11 @@ struct bfd_config;
 void	 route_init(void);
 int	 route_cache(struct route *, const struct in_addr *,
 	    const struct in_addr *, u_int);
+struct rtentry *route_mpath(struct route *, const struct in_addr *,
+	    const struct in_addr *, u_int);
 int	 route6_cache(struct route *, const struct in6_addr *,
+	    const struct in6_addr *, u_int);
+struct rtentry *route6_mpath(struct route *, const struct in6_addr *,
 	    const struct in6_addr *, u_int);
 void	 rtm_ifchg(struct ifnet *);
 void	 rtm_ifannounce(struct ifnet *, int);
Index: netinet/in_pcb.c
===================================================================
RCS file: /data/mirror/openbsd/cvs/src/sys/netinet/in_pcb.c,v
diff -u -p -r1.294 in_pcb.c
--- netinet/in_pcb.c	22 Feb 2024 14:25:58 -0000	1.294
+++ netinet/in_pcb.c	26 Feb 2024 18:19:42 -0000
@@ -908,23 +908,15 @@ in_pcblookup_local_lock(struct inpcbtabl
 struct rtentry *
 in_pcbrtentry(struct inpcb *inp)
 {
-	struct route *ro;
-
 #ifdef INET6
 	if (ISSET(inp->inp_flags, INP_IPV6))
 		return in6_pcbrtentry(inp);
 #endif
 
-	ro = &inp->inp_route;
-
 	if (inp->inp_faddr.s_addr == INADDR_ANY)
 		return (NULL);
-	if (route_cache(ro, &inp->inp_faddr, &inp->inp_laddr,
-	    inp->inp_rtableid)) {
-		ro->ro_rt = rtalloc_mpath(&ro->ro_dstsa,
-		    &inp->inp_laddr.s_addr, ro->ro_tableid);
-	}
-	return (ro->ro_rt);
+	return (route_mpath(&inp->inp_route, &inp->inp_faddr, &inp->inp_laddr,
+	    inp->inp_rtableid));
 }
 
 /*
@@ -938,7 +930,7 @@ in_pcbselsrc(struct in_addr *insrc, stru
     struct inpcb *inp)
 {
 	struct ip_moptions *mopts = inp->inp_moptions;
-	struct route *ro = &inp->inp_route;
+	struct rtentry *rt;
 	const struct in_addr *laddr = &inp->inp_laddr;
 	u_int rtableid = inp->inp_rtableid;
 	struct sockaddr	*ip4_source = NULL;
@@ -983,17 +975,14 @@ in_pcbselsrc(struct in_addr *insrc, stru
 	 * If route is known or can be allocated now,
 	 * our src addr is taken from the i/f, else punt.
 	 */
-	if (route_cache(ro, &sin->sin_addr, NULL, rtableid)) {
-		/* No route yet, so try to acquire one */
-		ro->ro_rt = rtalloc_mpath(&ro->ro_dstsa, NULL, ro->ro_tableid);
-	}
+	rt = route_mpath(&inp->inp_route, &sin->sin_addr, NULL, rtableid);
 
 	/*
 	 * If we found a route, use the address
 	 * corresponding to the outgoing interface.
 	 */
-	if (ro->ro_rt != NULL)
-		ia = ifatoia(ro->ro_rt->rt_ifa);
+	if (rt != NULL)
+		ia = ifatoia(rt->rt_ifa);
 
 	/*
 	 * Use preferred source address if :
@@ -1001,8 +990,7 @@ in_pcbselsrc(struct in_addr *insrc, stru
 	 * - preferred source address is set
 	 * - output interface is UP
 	 */
-	if (ro->ro_rt && !(ro->ro_rt->rt_flags & RTF_LLINFO) &&
-	    !(ro->ro_rt->rt_flags & RTF_HOST)) {
+	if (rt && !(rt->rt_flags & RTF_LLINFO) && !(rt->rt_flags & RTF_HOST)) {
 		ip4_source = rtable_getsource(rtableid, AF_INET);
 		if (ip4_source != NULL) {
 			struct ifaddr *ifa;
Index: netinet6/in6_pcb.c
===================================================================
RCS file: /data/mirror/openbsd/cvs/src/sys/netinet6/in6_pcb.c,v
diff -u -p -r1.139 in6_pcb.c
--- netinet6/in6_pcb.c	22 Feb 2024 14:25:58 -0000	1.139
+++ netinet6/in6_pcb.c	26 Feb 2024 18:19:42 -0000
@@ -561,16 +561,10 @@ in6_pcbnotify(struct inpcbtable *table, 
 struct rtentry *
 in6_pcbrtentry(struct inpcb *inp)
 {
-	struct route *ro = &inp->inp_route;
-
 	if (IN6_IS_ADDR_UNSPECIFIED(&inp->inp_faddr6))
 		return (NULL);
-	if (route6_cache(ro, &inp->inp_faddr6, &inp->inp_laddr6,
-	    inp->inp_rtableid)) {
-		ro->ro_rt = rtalloc_mpath(&ro->ro_dstsa,
-		    &inp->inp_laddr6.s6_addr32[0], ro->ro_tableid);
-	}
-	return (ro->ro_rt);
+	return (route6_mpath(&inp->inp_route, &inp->inp_faddr6,
+	    &inp->inp_laddr6, inp->inp_rtableid));
 }
 
 struct inpcb *
Index: netinet6/in6_src.c
===================================================================
RCS file: /data/mirror/openbsd/cvs/src/sys/netinet6/in6_src.c,v
diff -u -p -r1.95 in6_src.c
--- netinet6/in6_src.c	22 Feb 2024 14:25:58 -0000	1.95
+++ netinet6/in6_src.c	26 Feb 2024 18:35:00 -0000
@@ -95,7 +95,7 @@ in6_pcbselsrc(const struct in6_addr **in
     struct inpcb *inp, struct ip6_pktopts *opts)
 {
 	struct ip6_moptions *mopts = inp->inp_moptions6;
-	struct route *ro = &inp->inp_route;
+	struct rtentry *rt;
 	const struct in6_addr *laddr = &inp->inp_laddr6;
 	u_int rtableid = inp->inp_rtableid;
 	struct ifnet *ifp = NULL;
@@ -118,7 +118,8 @@ in6_pcbselsrc(const struct in6_addr **in
 		struct sockaddr_in6 sa6;
 
 		/* get the outgoing interface */
-		error = in6_selectif(dst, opts, mopts, ro, &ifp, rtableid);
+		error = in6_selectif(dst, opts, mopts, &inp->inp_route, &ifp,
+		    rtableid);
 		if (error)
 			return (error);
 
@@ -179,9 +180,7 @@ in6_pcbselsrc(const struct in6_addr **in
 	 * If route is known or can be allocated now,
 	 * our src addr is taken from the i/f, else punt.
 	 */
-	if (route6_cache(ro, dst, NULL, rtableid)) {
-		ro->ro_rt = rtalloc_mpath(&ro->ro_dstsa, NULL, ro->ro_tableid);
-	}
+	rt = route6_mpath(&inp->inp_route, dst, NULL, rtableid);
 
 	/*
 	 * in_pcbconnect() checks out IFF_LOOPBACK to skip using
@@ -190,14 +189,14 @@ in6_pcbselsrc(const struct in6_addr **in
 	 * so doesn't check out IFF_LOOPBACK.
 	 */
 
-	if (ro->ro_rt) {
-		ifp = if_get(ro->ro_rt->rt_ifidx);
+	if (rt != NULL) {
+		ifp = if_get(rt->rt_ifidx);
 		if (ifp != NULL) {
 			ia6 = in6_ifawithscope(ifp, dst, rtableid);
 			if_put(ifp);
 		}
 		if (ia6 == NULL) /* xxx scope error ?*/
-			ia6 = ifatoia6(ro->ro_rt->rt_ifa);
+			ia6 = ifatoia6(rt->rt_ifa);
 	}
 
 	/*
@@ -206,8 +205,7 @@ in6_pcbselsrc(const struct in6_addr **in
 	 * - preferred source address is set
 	 * - output interface is UP
 	 */
-	if (ro->ro_rt && !(ro->ro_rt->rt_flags & RTF_LLINFO) &&
-	    !(ro->ro_rt->rt_flags & RTF_HOST)) {
+	if (rt && !(rt->rt_flags & RTF_LLINFO) && !(rt->rt_flags & RTF_HOST)) {
 		ip6_source = rtable_getsource(rtableid, AF_INET6);
 		if (ip6_source != NULL) {
 			struct ifaddr *ifa;
@@ -304,11 +302,9 @@ in6_selectroute(const struct in6_addr *d
 	 * a new one.
 	 */
 	if (ro) {
-		if (route6_cache(ro, dst, NULL, rtableid)) {
-			/* No route yet, so try to acquire one */
-			ro->ro_rt = rtalloc_mpath(&ro->ro_dstsa, NULL,
-			    ro->ro_tableid);
-		}
+		struct rtentry *rt;
+
+		rt = route6_mpath(ro, dst, NULL, rtableid);
 
 		/*
 		 * Check if the outgoing interface conflicts with
@@ -319,15 +315,13 @@ in6_selectroute(const struct in6_addr *d
 		 */
 		if (opts && opts->ip6po_pktinfo &&
 		    opts->ip6po_pktinfo->ipi6_ifindex) {
-			if (ro->ro_rt != NULL &&
-			    !ISSET(ro->ro_rt->rt_flags, RTF_LOCAL) &&
-			    ro->ro_rt->rt_ifidx !=
-			    opts->ip6po_pktinfo->ipi6_ifindex) {
+			if (rt != NULL && !ISSET(rt->rt_flags, RTF_LOCAL) &&
+			    rt->rt_ifidx != opts->ip6po_pktinfo->ipi6_ifindex) {
 			    	return (NULL);
 			}
 		}
 
-		return (ro->ro_rt);
+		return (rt);
 	}
 
 	return (NULL);
@@ -338,7 +332,7 @@ in6_selectif(const struct in6_addr *dst,
     struct ip6_moptions *mopts, struct route *ro, struct ifnet **retifp,
     u_int rtableid)
 {
-	struct rtentry *rt = NULL;
+	struct rtentry *rt;
 	struct in6_pktinfo *pi = NULL;
 
 	/* If the caller specify the outgoing interface explicitly, use it. */
@@ -377,11 +371,10 @@ in6_selectif(const struct in6_addr *dst,
 	 * Although this may not be very harmful, it should still be confusing.
 	 * We thus reject the case here.
 	 */
-	if (rt && (rt->rt_flags & (RTF_REJECT | RTF_BLACKHOLE)))
+	if (ISSET(rt->rt_flags, RTF_REJECT | RTF_BLACKHOLE))
 		return (rt->rt_flags & RTF_HOST ? EHOSTUNREACH : ENETUNREACH);
 
-	if (rt != NULL)
-		*retifp = if_get(rt->rt_ifidx);
+	*retifp = if_get(rt->rt_ifidx);
 
 	return (0);
 }