From: Florian Obser Subject: Re: add source link-layer address option to rad(8) To: tech@openbsd.org Date: Thu, 16 May 2024 18:28:31 +0200 This one is OK florian - or I take OKs from devs to commit it myself. (some comments inline) On 2024-05-15 19:33 -07, Ryan Vogt wrote: > On Wed, May 15, 2024 at 09:14:14PM +0200, Florian Obser wrote: >> On 2024-05-11 12:34 -07, Ryan Vogt wrote: > > Thank you again for providing another round of such comprehensive > feedback. I've placed a new version of the patch at the bottom, with > the race condition fixed using your code, and I've inlined some more > discussion. > >> > That "I" flag is what I'm looking at. With the default route being in >> > interface scope, you can do things like: >> > >> > $ ping6 fdaa:...:1%en0 >> > >> > but not: >> > >> > $ ping6 fdaa:...:1 >> > >> >> Interesting, I have no idea what they are doing here. The netstat man >> page is not explaining what I is, it and it's not a thing FreeBSD has >> either. At least it's not documented in their netstat man page. > > This one's a little buried in the macOS man pages. From macOS 14.5's > netstat(1) man page, regarding the "I" flag: > > I RTF_IFSCOPE Route is associated with an interface > scope > > And from macOS' route(8) man page: > > Such property allows for the presence of multiple route > entries with the same destination, where each route is > associated with a unique interface. > > On my functional, working-as-intended macOS client, I have 7 entries > in my Internet6 routing table with a destination of "default" > (netstat -rn), but each is associated with a different interface. > > My understanding of how macOS chooses which interface to communicate > over (wireless, ethernet, tunnel, etc.) is that it takes one of those > route entries out of interface scope, making it the global "default" > route. So: > > $ ping6 ipv6.google.com > > will use my non-interface-scope "default" route, but: > > $ ping6 ipv6.google.com%en0 > > lets me specify an interface (whether its corresponding "default" > route is in interface scope or not). > > If there's no source link-layer address option in rad(8)'s RA messages, > all 7 "default" Internet6 routing table entries on my macOS client > stay in interface scope (%en0, and %utun0 through %utun5). With the > source link-layer address option included in rad(8)'s RA messages, > though, my %en0 "default" routing table entry (corresponding to my > wireless interface) comes out of interface scope and acts as the > global "default" route. > >> This kinda maybe looks like a bug in macOS / iOS where they are not >> doing neighbor discovery? > > I like the "kinda maybe" qualifier, because that's my conclusion as > well -- with the same qualifier attached. > > Even in the absence of a source link-layer address option, I'm still > seeing some evidence of neighbour discovery between the macOS client > (neighbour solicitation messages) and the OpenBSD gateway (solicited > neighbour advertisement messages). So, I don't know if maybe the macOS > behaviour (in the absence of a source link-layer address option) is an > intended behaviour, given how macOS prioritizes its multiple default > routes that co-exist on multiple interfaces? > > That's meandering conjecture on my part, though. I just don't know > enough about the intricacies of how interface scope is supposed to > work to be confident in declaring that this behaviour is a bug in > macOS/iOS. > > ...just, kinda maybe probably a bug? > >> But regardless, rad(8) really should send a >> source link-layer address, otherwise every client has to do one more >> round trip. Very nice find! > > Thank you! > > Here's a question I've been pondering, then. Should the source link- > layer address option be a setting in rad.conf(5) at all? > I had the same thought on your very first email but I decided on: "the code is already there and it's not really hurting." rad.conf(5) has a lot of buttons to push, one more doesn't matter that much. I think it's fine. > It feels weird to be playing devil's advocate regarding my own > proposed patch, but it might be worth asking before going forward. > > You mentioned it was an oversight in rad(8) that the source link-layer > address option wasn't implemented initially. You also mentioned that > you don't think OpenBSD can support the inbound load sharing described > in RFC 4861 section 4.2 -- and, that inbound load sharing is the only > reason I can see in the RFC for omitting the source link-layer address > option in router advertisement messages. You also pointed out how, > when the source link-layer address option is included in RA messages, > it saves all clients from having to do an additional round trip. > > So, is there even a compelling case to be made for being able to > exclude the source link-layer address option with a setting in > rad.conf(5)? Or, should the source link-layer address option just > always be included by rad(8) in its RA messages, and we forego the > option to turn it off in rad.conf(5)? > >> > A few quick notes: >> > >> > 1. In build_packet(), I've placed the struct pointers >> > nd_opt_source_link_addr, nd_opt_mtu, and nd_opt_prefix_info in the >> > order they're presented in RFC 4861 section 4.2. >> >> sure, style(9) has size: >> When declaring variables in functions, declare them sorted by size >> (largest to smallest), >> >> I usually can't be arsed to work out which struct is larger, unless I >> know it off the top of my head. > > I need to read more carefully. I could change the location of the Sorry, I meant don't worry about it. It's fine as is. > nd_opt_source_link_addr declaration in rad/frontend.c's > build_packet() to follow style(9), but it's not quite clear where the > declaration would go. For example, the nd_opt_mtu and the > nd_opt_prefix_info are already in backwards order, relative to the > guideline in style(9), so I'd have to start re-ordering other > variables to be able to slot the nd_opt_source_link_addr into a size- > sorted location. > > As it is, the existing variables (pre-patch) have a sane, how-are- > they-used order to them. If it's okay, I'm inclined to follow that > same logical style that's already being used in rad/frontend.c's > build_packet(), and leave that part of the proposed patch unchanged > from the previous version? > >> > 2. I'm uncertain why we're doing a strcmp() against the name of the >> > interface while iterating over ifap, instead of comparing against >> > the interface index. But, since that's how it's done in >> > slaacd/frontend.c, where both the name and the index are also >> > available, I'm assuming there's a reason we use the name instead of >> > the index that I just don't know about. So, I switched to using a >> > strcmp() against the interface name instead of comparing against >> > the index. >> >> I was under the impression that an interface index could be >> recycled when you unplug a usb device and plug in a different one. But >> that does not seem to be true. if_index seems to be monotonically >> increasing. I think it's best to leave this for now as is and maybe >> change all "may" daemons to use if_index. This probably shows up in >> dhcpleased, slaacd and unwind as well. > > Thanks for the explanation. I've left that the same as it was in the > previous version of the proposed patch. > >> > + if (ra_iface_conf->ra_options.source_link_addr) { >> > + ndopt_source_link_addr = (struct nd_opt_source_link_addr *)p; >> > + ndopt_source_link_addr->nd_opt_source_link_addr_type = >> > + ND_OPT_SOURCE_LINKADDR; >> > + ndopt_source_link_addr->nd_opt_source_link_addr_len = 1; >> > + if (getifaddrs(&ifap) != 0) { >> > + ifap = NULL; >> > + log_warn("getifaddrs"); >> > + } >> > + for (ifa = ifap; ifa; ifa = ifa->ifa_next) { >> > + [...snip...] >> > + } >> > + if (ifap != NULL) >> > + freeifaddrs(ifap); >> > + p += sizeof(*ndopt_source_link_addr); >> >> if you fail to find a mac address, for example because you are racing >> against an interface being unplugged, you will (try to) send a source >> link addr option with a mac address of all zero. >> >> I think the easiest way to avoid that is doing something like this: >> >> if (ifap != NULL) { >> freeifaddrs(ifap); >> p += sizeof(*ndopt_source_link_addr); >> } else >> len -= sizeof(*ndopt_source_link_addr); > > Nice catch! I've updated the proposed patch below with this change > (that's the only change from the previous version). > > diff --git a/frontend.c b/frontend.c > index 6748e43732d..5370cb3359b 100644 > --- a/frontend.c > +++ b/frontend.c > @@ -56,6 +56,7 @@ > #include > > #include > +#include > #include > #include > > @@ -120,6 +121,12 @@ struct nd_opt_pref64 { > u_int8_t nd_opt_pref64[12]; > }; > > +struct nd_opt_source_link_addr { > + u_int8_t nd_opt_source_link_addr_type; > + u_int8_t nd_opt_source_link_addr_len; > + struct ether_addr nd_opt_source_link_addr_hw_addr; > +}; > + > TAILQ_HEAD(, ra_iface) ra_interfaces; > > __dead void frontend_shutdown(void); > @@ -1099,6 +1106,7 @@ void > build_packet(struct ra_iface *ra_iface) > { > struct nd_router_advert *ra; > + struct nd_opt_source_link_addr *ndopt_source_link_addr; > struct nd_opt_mtu *ndopt_mtu; > struct nd_opt_prefix_info *ndopt_pi; > struct ra_iface_conf *ra_iface_conf; > @@ -1110,6 +1118,8 @@ build_packet(struct ra_iface *ra_iface) > struct ra_rdnss_conf *ra_rdnss; > struct ra_dnssl_conf *ra_dnssl; > struct ra_pref64_conf *pref64; > + struct ifaddrs *ifap, *ifa; > + struct sockaddr_dl *sdl; > size_t len, label_len; > uint8_t *p, buf[RA_MAX_SIZE]; > char *label_start, *label_end; > @@ -1119,6 +1129,8 @@ build_packet(struct ra_iface *ra_iface) > ra_options_conf = &ra_iface_conf->ra_options; > > len = sizeof(*ra); > + if (ra_iface_conf->ra_options.source_link_addr) > + len += sizeof(*ndopt_source_link_addr); > if (ra_options_conf->mtu > 0) > len += sizeof(*ndopt_mtu); > len += sizeof(*ndopt_pi) * ra_iface->prefix_count; > @@ -1170,6 +1182,37 @@ build_packet(struct ra_iface *ra_iface) > ra->nd_ra_retransmit = htonl(ra_options_conf->retrans_timer); > p += sizeof(*ra); > > + if (ra_iface_conf->ra_options.source_link_addr) { > + ndopt_source_link_addr = (struct nd_opt_source_link_addr *)p; > + ndopt_source_link_addr->nd_opt_source_link_addr_type = > + ND_OPT_SOURCE_LINKADDR; > + ndopt_source_link_addr->nd_opt_source_link_addr_len = 1; > + if (getifaddrs(&ifap) != 0) { > + ifap = NULL; > + log_warn("getifaddrs"); > + } > + for (ifa = ifap; ifa; ifa = ifa->ifa_next) { > + if (ifa->ifa_addr == NULL || > + ifa->ifa_addr->sa_family != AF_LINK) > + continue; > + if (strcmp(ra_iface->name, ifa->ifa_name) != 0) > + continue; > + sdl = (struct sockaddr_dl *)ifa->ifa_addr; > + if (sdl->sdl_type != IFT_ETHER || > + sdl->sdl_alen != ETHER_ADDR_LEN) > + continue; > + memcpy(&ndopt_source_link_addr-> > + nd_opt_source_link_addr_hw_addr, > + LLADDR(sdl), ETHER_ADDR_LEN); > + break; > + } > + if (ifap != NULL) { > + freeifaddrs(ifap); > + p += sizeof(*ndopt_source_link_addr); > + } else > + len -= sizeof(*ndopt_source_link_addr); > + } > + > if (ra_options_conf->mtu > 0) { > ndopt_mtu = (struct nd_opt_mtu *)p; > ndopt_mtu->nd_opt_mtu_type = ND_OPT_MTU; > diff --git a/parse.y b/parse.y > index 66cc0c0ab64..b9d369bde17 100644 > --- a/parse.y > +++ b/parse.y > @@ -122,6 +122,7 @@ typedef struct { > %token CONFIGURATION OTHER LIFETIME REACHABLE TIME RETRANS TIMER > %token AUTO PREFIX VALID PREFERENCE PREFERRED LIFETIME ONLINK AUTONOMOUS > %token ADDRESS_CONFIGURATION DNS NAMESERVER SEARCH MTU NAT64 HIGH MEDIUM LOW > +%token SOURCE LINK_LAYER > > %token STRING > %token NUMBER > @@ -226,6 +227,9 @@ ra_opt_block : DEFAULT ROUTER yesno { > | RETRANS TIMER NUMBER { > ra_options->retrans_timer = $3; > } > + | SOURCE LINK_LAYER ADDRESS yesno { > + ra_options->source_link_addr = $4; > + } > | MTU NUMBER { > ra_options->mtu = $2; > } > @@ -523,6 +527,7 @@ lookup(char *s) > {"interface", RA_IFACE}, > {"lifetime", LIFETIME}, > {"limit", LIMIT}, > + {"link-layer", LINK_LAYER}, > {"low", LOW}, > {"managed", MANAGED}, > {"medium", MEDIUM}, > @@ -539,6 +544,7 @@ lookup(char *s) > {"retrans", RETRANS}, > {"router", ROUTER}, > {"search", SEARCH}, > + {"source", SOURCE}, > {"time", TIME}, > {"timer", TIMER}, > {"valid", VALID}, > diff --git a/printconf.c b/printconf.c > index 184a5df2dc1..2edfb06edd4 100644 > --- a/printconf.c > +++ b/printconf.c > @@ -78,6 +78,8 @@ print_ra_options(const char *indent, const struct ra_options_conf *ra_options) > printf("%srouter lifetime %d\n", indent, ra_options->router_lifetime); > printf("%sreachable time %u\n", indent, ra_options->reachable_time); > printf("%sretrans timer %u\n", indent, ra_options->retrans_timer); > + printf("%ssource link-layer address %s\n", indent, > + yesno(ra_options->source_link_addr)); > if (ra_options->mtu > 0) > printf("%smtu %u\n", indent, ra_options->mtu); > > diff --git a/rad.c b/rad.c > index b283e58fbf1..9093932c563 100644 > --- a/rad.c > +++ b/rad.c > @@ -757,6 +757,7 @@ config_new_empty(void) > xconf->ra_options.router_lifetime = ADV_DEFAULT_LIFETIME; > xconf->ra_options.reachable_time = 0; > xconf->ra_options.retrans_timer = 0; > + xconf->ra_options.source_link_addr = 1; > xconf->ra_options.mtu = 0; > xconf->ra_options.rdns_lifetime = DEFAULT_RDNS_LIFETIME; > SIMPLEQ_INIT(&xconf->ra_options.ra_rdnss_list); > diff --git a/rad.conf.5 b/rad.conf.5 > index f82ccfd216e..a2589cda88c 100644 > --- a/rad.conf.5 > +++ b/rad.conf.5 > @@ -127,6 +127,10 @@ The default is medium. > .\" XXX > .\" .It Ic retrans timer Ar number > .\" XXX > +.It Ic source link-layer address Pq Ic yes Ns | Ns Ic no > +Add a source link-layer address option to router advertisement messages, to > +communicate the link-layer address of the sending interface. > +The default is yes. > .El > .Sh INTERFACES > A list of interfaces or interface groups to send advertisements on: > diff --git a/rad.h b/rad.h > index 787e78c7c4a..61187dba273 100644 > --- a/rad.h > +++ b/rad.h > @@ -97,6 +97,7 @@ struct ra_options_conf { > int router_lifetime; /* default router lifetime */ > uint32_t reachable_time; > uint32_t retrans_timer; > + int source_link_addr; /* source link-layer address */ > uint32_t mtu; > uint32_t rdns_lifetime; > SIMPLEQ_HEAD(, ra_rdnss_conf) ra_rdnss_list; > -- In my defence, I have been left unsupervised.