Index | Thread | Search

From:
Florian Obser <florian@openbsd.org>
Subject:
Re: add source link-layer address option to rad(8)
To:
tech@openbsd.org
Date:
Thu, 16 May 2024 18:28:31 +0200

Download raw body.

Thread
This one is OK florian - or I take OKs from devs to commit it myself.

(some comments inline)

On 2024-05-15 19:33 -07, Ryan Vogt <rvogt.ca@gmail.com> wrote:
> On Wed, May 15, 2024 at 09:14:14PM +0200, Florian Obser wrote:
>> On 2024-05-11 12:34 -07, Ryan Vogt <rvogt.ca@gmail.com> wrote:
>
> Thank you again for providing another round of such comprehensive
> feedback. I've placed a new version of the patch at the bottom, with
> the race condition fixed using your code, and I've inlined some more
> discussion.
>
>> > That "I" flag is what I'm looking at. With the default route being in
>> > interface scope, you can do things like:
>> >
>> >     $ ping6 fdaa:...:1%en0
>> >
>> > but not:
>> >
>> >     $ ping6 fdaa:...:1
>> >
>> 
>> Interesting, I have no idea what they are doing here. The netstat man
>> page is not explaining what I is, it and it's not a thing FreeBSD has
>> either. At least it's not documented in their netstat man page.
>
> This one's a little buried in the macOS man pages. From macOS 14.5's
> netstat(1) man page, regarding the "I" flag:
>
> 	I	RTF_IFSCOPE	Route is associated with an interface
> 				scope
>
> And from macOS' route(8) man page:
>
> 	Such property allows for the presence of multiple route
> 	entries with the same destination, where each route is
> 	associated with a unique interface.
>
> On my functional, working-as-intended macOS client, I have 7 entries
> in my Internet6 routing table with a destination of "default"
> (netstat -rn), but each is associated with a different interface.
>
> My understanding of how macOS chooses which interface to communicate
> over (wireless, ethernet, tunnel, etc.) is that it takes one of those
> route entries out of interface scope, making it the global "default"
> route. So:
>
> 	$ ping6 ipv6.google.com
>
> will use my non-interface-scope "default" route, but:
>
> 	$ ping6 ipv6.google.com%en0
>
> lets me specify an interface (whether its corresponding "default"
> route is in interface scope or not).
>
> If there's no source link-layer address option in rad(8)'s RA messages,
> all 7 "default" Internet6 routing table entries on my macOS client
> stay in interface scope (%en0, and %utun0 through %utun5). With the
> source link-layer address option included in rad(8)'s RA messages,
> though, my %en0 "default" routing table entry (corresponding to my
> wireless interface) comes out of interface scope and acts as the
> global "default" route.
>
>> This kinda maybe looks like a bug in macOS / iOS where they are not
>> doing neighbor discovery?
>
> I like the "kinda maybe" qualifier, because that's my conclusion as
> well -- with the same qualifier attached.
>
> Even in the absence of a source link-layer address option, I'm still
> seeing some evidence of neighbour discovery between the macOS client
> (neighbour solicitation messages) and the OpenBSD gateway (solicited
> neighbour advertisement messages). So, I don't know if maybe the macOS
> behaviour (in the absence of a source link-layer address option) is an
> intended behaviour, given how macOS prioritizes its multiple default
> routes that co-exist on multiple interfaces?
>
> That's meandering conjecture on my part, though. I just don't know
> enough about the intricacies of how interface scope is supposed to
> work to be confident in declaring that this behaviour is a bug in
> macOS/iOS.
>
> ...just, kinda maybe probably a bug?
>
>> But regardless, rad(8) really should send a
>> source link-layer address, otherwise every client has to do one more
>> round trip. Very nice find!
>
> Thank you!
>
> Here's a question I've been pondering, then. Should the source link-
> layer address option be a setting in rad.conf(5) at all?
>

I had the same thought on your very first email but I decided on: "the
code is already there and it's not really hurting."

rad.conf(5) has a lot of buttons to push, one more doesn't matter that
much. I think it's fine.

> It feels weird to be playing devil's advocate regarding my own
> proposed patch, but it might be worth asking before going forward.
>
> You mentioned it was an oversight in rad(8) that the source link-layer
> address option wasn't implemented initially. You also mentioned that
> you don't think OpenBSD can support the inbound load sharing described
> in RFC 4861 section 4.2 -- and, that inbound load sharing is the only
> reason I can see in the RFC for omitting the source link-layer address
> option in router advertisement messages. You also pointed out how,
> when the source link-layer address option is included in RA messages,
> it saves all clients from having to do an additional round trip.
>
> So, is there even a compelling case to be made for being able to
> exclude the source link-layer address option with a setting in
> rad.conf(5)?  Or, should the source link-layer address option just
> always be included by rad(8) in its RA messages, and we forego the
> option to turn it off in rad.conf(5)?
>
>> > A few quick notes:
>> >
>> > 1. In build_packet(), I've placed the struct pointers
>> >    nd_opt_source_link_addr, nd_opt_mtu, and nd_opt_prefix_info in the
>> >    order they're presented in RFC 4861 section 4.2.
>> 
>> sure, style(9) has size:
>>      When declaring variables in functions, declare them sorted by size
>>      (largest to smallest),
>> 
>> I usually can't be arsed to work out which struct is larger, unless I
>> know it off the top of my head.
>
> I need to read more carefully. I could change the location of the

Sorry, I meant don't worry about it. It's fine as is.

> nd_opt_source_link_addr declaration in rad/frontend.c's
> build_packet() to follow style(9), but it's not quite clear where the
> declaration would go. For example, the nd_opt_mtu and the
> nd_opt_prefix_info are already in backwards order, relative to the
> guideline in style(9), so I'd have to start re-ordering other
> variables to be able to slot the nd_opt_source_link_addr into a size-
> sorted location.
>
> As it is, the existing variables (pre-patch) have a sane, how-are-
> they-used order to them. If it's okay, I'm inclined to follow that
> same logical style that's already being used in rad/frontend.c's
> build_packet(), and leave that part of the proposed patch unchanged
> from the previous version?
>
>> > 2. I'm uncertain why we're doing a strcmp() against the name of the
>> >    interface while iterating over ifap, instead of comparing against
>> >    the interface index. But, since that's how it's done in
>> >    slaacd/frontend.c, where both the name and the index are also
>> >    available, I'm assuming there's a reason we use the name instead of
>> >    the index that I just don't know about. So, I switched to using a
>> >    strcmp() against the interface name instead of comparing against
>> >    the index.
>> 
>> I was under the impression that an interface index could be
>> recycled when you unplug a usb device and plug in a different one. But
>> that does not seem to be true. if_index seems to be monotonically
>> increasing. I think it's best to leave this for now as is and maybe
>> change all "may" daemons to use if_index. This probably shows up in
>> dhcpleased, slaacd and unwind as well.
>
> Thanks for the explanation. I've left that the same as it was in the
> previous version of the proposed patch.
>
>> > +	if (ra_iface_conf->ra_options.source_link_addr) {
>> > +		ndopt_source_link_addr = (struct nd_opt_source_link_addr *)p;
>> > +		ndopt_source_link_addr->nd_opt_source_link_addr_type =
>> > +		    ND_OPT_SOURCE_LINKADDR;
>> > +		ndopt_source_link_addr->nd_opt_source_link_addr_len = 1;
>> > +		if (getifaddrs(&ifap) != 0) {
>> > +			ifap = NULL;
>> > +			log_warn("getifaddrs");
>> > +		}
>> > +		for (ifa = ifap; ifa; ifa = ifa->ifa_next) {
>> > +			[...snip...]
>> > +		}
>> > +		if (ifap != NULL)
>> > +			freeifaddrs(ifap);
>> > +		p += sizeof(*ndopt_source_link_addr);
>> 
>> if you fail to find a mac address, for example because you are racing
>> against an interface being unplugged, you will (try to) send a source
>> link addr option with a mac address of all zero.
>> 
>> I think the easiest way to avoid that is doing something like this:
>> 
>> 		if (ifap != NULL) {
>> 			freeifaddrs(ifap);
>> 			p += sizeof(*ndopt_source_link_addr);
>> 		} else
>> 			len -= sizeof(*ndopt_source_link_addr);
>
> Nice catch! I've updated the proposed patch below with this change
> (that's the only change from the previous version).
>
> diff --git a/frontend.c b/frontend.c
> index 6748e43732d..5370cb3359b 100644
> --- a/frontend.c
> +++ b/frontend.c
> @@ -56,6 +56,7 @@
>  #include <sys/uio.h>
>  
>  #include <net/if.h>
> +#include <net/if_dl.h>
>  #include <net/if_types.h>
>  #include <net/route.h>
>  
> @@ -120,6 +121,12 @@ struct nd_opt_pref64 {
>  	u_int8_t	nd_opt_pref64[12];
>  };
>  
> +struct nd_opt_source_link_addr {
> +	u_int8_t		nd_opt_source_link_addr_type;
> +	u_int8_t		nd_opt_source_link_addr_len;
> +	struct ether_addr	nd_opt_source_link_addr_hw_addr;
> +};
> +
>  TAILQ_HEAD(, ra_iface)	ra_interfaces;
>  
>  __dead void		 frontend_shutdown(void);
> @@ -1099,6 +1106,7 @@ void
>  build_packet(struct ra_iface *ra_iface)
>  {
>  	struct nd_router_advert		*ra;
> +	struct nd_opt_source_link_addr	*ndopt_source_link_addr;
>  	struct nd_opt_mtu		*ndopt_mtu;
>  	struct nd_opt_prefix_info	*ndopt_pi;
>  	struct ra_iface_conf		*ra_iface_conf;
> @@ -1110,6 +1118,8 @@ build_packet(struct ra_iface *ra_iface)
>  	struct ra_rdnss_conf		*ra_rdnss;
>  	struct ra_dnssl_conf		*ra_dnssl;
>  	struct ra_pref64_conf		*pref64;
> +	struct ifaddrs			*ifap, *ifa;
> +	struct sockaddr_dl		*sdl;
>  	size_t				 len, label_len;
>  	uint8_t				*p, buf[RA_MAX_SIZE];
>  	char				*label_start, *label_end;
> @@ -1119,6 +1129,8 @@ build_packet(struct ra_iface *ra_iface)
>  	ra_options_conf = &ra_iface_conf->ra_options;
>  
>  	len = sizeof(*ra);
> +	if (ra_iface_conf->ra_options.source_link_addr)
> +		len += sizeof(*ndopt_source_link_addr);
>  	if (ra_options_conf->mtu > 0)
>  		len += sizeof(*ndopt_mtu);
>  	len += sizeof(*ndopt_pi) * ra_iface->prefix_count;
> @@ -1170,6 +1182,37 @@ build_packet(struct ra_iface *ra_iface)
>  	ra->nd_ra_retransmit = htonl(ra_options_conf->retrans_timer);
>  	p += sizeof(*ra);
>  
> +	if (ra_iface_conf->ra_options.source_link_addr) {
> +		ndopt_source_link_addr = (struct nd_opt_source_link_addr *)p;
> +		ndopt_source_link_addr->nd_opt_source_link_addr_type =
> +		    ND_OPT_SOURCE_LINKADDR;
> +		ndopt_source_link_addr->nd_opt_source_link_addr_len = 1;
> +		if (getifaddrs(&ifap) != 0) {
> +			ifap = NULL;
> +			log_warn("getifaddrs");
> +		}
> +		for (ifa = ifap; ifa; ifa = ifa->ifa_next) {
> +			if (ifa->ifa_addr == NULL ||
> +			    ifa->ifa_addr->sa_family != AF_LINK)
> +				continue;
> +			if (strcmp(ra_iface->name, ifa->ifa_name) != 0)
> +				continue;
> +			sdl = (struct sockaddr_dl *)ifa->ifa_addr;
> +			if (sdl->sdl_type != IFT_ETHER ||
> +			    sdl->sdl_alen != ETHER_ADDR_LEN)
> +				continue;
> +			memcpy(&ndopt_source_link_addr->
> +			    nd_opt_source_link_addr_hw_addr,
> +			    LLADDR(sdl), ETHER_ADDR_LEN);
> +			break;
> +		}
> +		if (ifap != NULL) {
> +			freeifaddrs(ifap);
> +			p += sizeof(*ndopt_source_link_addr);
> +		} else
> +			len -= sizeof(*ndopt_source_link_addr);
> +	}
> +
>  	if (ra_options_conf->mtu > 0) {
>  		ndopt_mtu = (struct nd_opt_mtu *)p;
>  		ndopt_mtu->nd_opt_mtu_type = ND_OPT_MTU;
> diff --git a/parse.y b/parse.y
> index 66cc0c0ab64..b9d369bde17 100644
> --- a/parse.y
> +++ b/parse.y
> @@ -122,6 +122,7 @@ typedef struct {
>  %token	CONFIGURATION OTHER LIFETIME REACHABLE TIME RETRANS TIMER
>  %token	AUTO PREFIX VALID PREFERENCE PREFERRED LIFETIME ONLINK AUTONOMOUS
>  %token	ADDRESS_CONFIGURATION DNS NAMESERVER SEARCH MTU NAT64 HIGH MEDIUM LOW
> +%token	SOURCE LINK_LAYER
>  
>  %token	<v.string>	STRING
>  %token	<v.number>	NUMBER
> @@ -226,6 +227,9 @@ ra_opt_block	: DEFAULT ROUTER yesno {
>  		| RETRANS TIMER NUMBER {
>  			ra_options->retrans_timer = $3;
>  		}
> +		| SOURCE LINK_LAYER ADDRESS yesno {
> +			ra_options->source_link_addr = $4;
> +		}
>  		| MTU NUMBER {
>  			ra_options->mtu = $2;
>  		}
> @@ -523,6 +527,7 @@ lookup(char *s)
>  		{"interface",		RA_IFACE},
>  		{"lifetime",		LIFETIME},
>  		{"limit",		LIMIT},
> +		{"link-layer",		LINK_LAYER},
>  		{"low",			LOW},
>  		{"managed",		MANAGED},
>  		{"medium",		MEDIUM},
> @@ -539,6 +544,7 @@ lookup(char *s)
>  		{"retrans",		RETRANS},
>  		{"router",		ROUTER},
>  		{"search",		SEARCH},
> +		{"source",		SOURCE},
>  		{"time",		TIME},
>  		{"timer",		TIMER},
>  		{"valid",		VALID},
> diff --git a/printconf.c b/printconf.c
> index 184a5df2dc1..2edfb06edd4 100644
> --- a/printconf.c
> +++ b/printconf.c
> @@ -78,6 +78,8 @@ print_ra_options(const char *indent, const struct ra_options_conf *ra_options)
>  	printf("%srouter lifetime %d\n", indent, ra_options->router_lifetime);
>  	printf("%sreachable time %u\n", indent, ra_options->reachable_time);
>  	printf("%sretrans timer %u\n", indent, ra_options->retrans_timer);
> +	printf("%ssource link-layer address %s\n", indent,
> +	    yesno(ra_options->source_link_addr));
>  	if (ra_options->mtu > 0)
>  		printf("%smtu %u\n", indent, ra_options->mtu);
>  
> diff --git a/rad.c b/rad.c
> index b283e58fbf1..9093932c563 100644
> --- a/rad.c
> +++ b/rad.c
> @@ -757,6 +757,7 @@ config_new_empty(void)
>  	xconf->ra_options.router_lifetime = ADV_DEFAULT_LIFETIME;
>  	xconf->ra_options.reachable_time = 0;
>  	xconf->ra_options.retrans_timer = 0;
> +	xconf->ra_options.source_link_addr = 1;
>  	xconf->ra_options.mtu = 0;
>  	xconf->ra_options.rdns_lifetime = DEFAULT_RDNS_LIFETIME;
>  	SIMPLEQ_INIT(&xconf->ra_options.ra_rdnss_list);
> diff --git a/rad.conf.5 b/rad.conf.5
> index f82ccfd216e..a2589cda88c 100644
> --- a/rad.conf.5
> +++ b/rad.conf.5
> @@ -127,6 +127,10 @@ The default is medium.
>  .\" XXX
>  .\" .It Ic retrans timer Ar number
>  .\" XXX
> +.It Ic source link-layer address Pq Ic yes Ns | Ns Ic no
> +Add a source link-layer address option to router advertisement messages, to
> +communicate the link-layer address of the sending interface.
> +The default is yes.
>  .El
>  .Sh INTERFACES
>  A list of interfaces or interface groups to send advertisements on:
> diff --git a/rad.h b/rad.h
> index 787e78c7c4a..61187dba273 100644
> --- a/rad.h
> +++ b/rad.h
> @@ -97,6 +97,7 @@ struct ra_options_conf {
>  	int		router_lifetime;	/* default router lifetime */
>  	uint32_t	reachable_time;
>  	uint32_t	retrans_timer;
> +	int		source_link_addr;	/* source link-layer address */
>  	uint32_t	mtu;
>  	uint32_t	rdns_lifetime;
>  	SIMPLEQ_HEAD(, ra_rdnss_conf)		 ra_rdnss_list;
>

-- 
In my defence, I have been left unsupervised.