Download raw body.
add source link-layer address option to rad(8)
On Wed, May 15, 2024 at 09:14:14PM +0200, Florian Obser wrote:
> On 2024-05-11 12:34 -07, Ryan Vogt <rvogt.ca@gmail.com> wrote:
Thank you again for providing another round of such comprehensive
feedback. I've placed a new version of the patch at the bottom, with
the race condition fixed using your code, and I've inlined some more
discussion.
> > That "I" flag is what I'm looking at. With the default route being in
> > interface scope, you can do things like:
> >
> > $ ping6 fdaa:...:1%en0
> >
> > but not:
> >
> > $ ping6 fdaa:...:1
> >
>
> Interesting, I have no idea what they are doing here. The netstat man
> page is not explaining what I is, it and it's not a thing FreeBSD has
> either. At least it's not documented in their netstat man page.
This one's a little buried in the macOS man pages. From macOS 14.5's
netstat(1) man page, regarding the "I" flag:
I RTF_IFSCOPE Route is associated with an interface
scope
And from macOS' route(8) man page:
Such property allows for the presence of multiple route
entries with the same destination, where each route is
associated with a unique interface.
On my functional, working-as-intended macOS client, I have 7 entries
in my Internet6 routing table with a destination of "default"
(netstat -rn), but each is associated with a different interface.
My understanding of how macOS chooses which interface to communicate
over (wireless, ethernet, tunnel, etc.) is that it takes one of those
route entries out of interface scope, making it the global "default"
route. So:
$ ping6 ipv6.google.com
will use my non-interface-scope "default" route, but:
$ ping6 ipv6.google.com%en0
lets me specify an interface (whether its corresponding "default"
route is in interface scope or not).
If there's no source link-layer address option in rad(8)'s RA messages,
all 7 "default" Internet6 routing table entries on my macOS client
stay in interface scope (%en0, and %utun0 through %utun5). With the
source link-layer address option included in rad(8)'s RA messages,
though, my %en0 "default" routing table entry (corresponding to my
wireless interface) comes out of interface scope and acts as the
global "default" route.
> This kinda maybe looks like a bug in macOS / iOS where they are not
> doing neighbor discovery?
I like the "kinda maybe" qualifier, because that's my conclusion as
well -- with the same qualifier attached.
Even in the absence of a source link-layer address option, I'm still
seeing some evidence of neighbour discovery between the macOS client
(neighbour solicitation messages) and the OpenBSD gateway (solicited
neighbour advertisement messages). So, I don't know if maybe the macOS
behaviour (in the absence of a source link-layer address option) is an
intended behaviour, given how macOS prioritizes its multiple default
routes that co-exist on multiple interfaces?
That's meandering conjecture on my part, though. I just don't know
enough about the intricacies of how interface scope is supposed to
work to be confident in declaring that this behaviour is a bug in
macOS/iOS.
...just, kinda maybe probably a bug?
> But regardless, rad(8) really should send a
> source link-layer address, otherwise every client has to do one more
> round trip. Very nice find!
Thank you!
Here's a question I've been pondering, then. Should the source link-
layer address option be a setting in rad.conf(5) at all?
It feels weird to be playing devil's advocate regarding my own
proposed patch, but it might be worth asking before going forward.
You mentioned it was an oversight in rad(8) that the source link-layer
address option wasn't implemented initially. You also mentioned that
you don't think OpenBSD can support the inbound load sharing described
in RFC 4861 section 4.2 -- and, that inbound load sharing is the only
reason I can see in the RFC for omitting the source link-layer address
option in router advertisement messages. You also pointed out how,
when the source link-layer address option is included in RA messages,
it saves all clients from having to do an additional round trip.
So, is there even a compelling case to be made for being able to
exclude the source link-layer address option with a setting in
rad.conf(5)? Or, should the source link-layer address option just
always be included by rad(8) in its RA messages, and we forego the
option to turn it off in rad.conf(5)?
> > A few quick notes:
> >
> > 1. In build_packet(), I've placed the struct pointers
> > nd_opt_source_link_addr, nd_opt_mtu, and nd_opt_prefix_info in the
> > order they're presented in RFC 4861 section 4.2.
>
> sure, style(9) has size:
> When declaring variables in functions, declare them sorted by size
> (largest to smallest),
>
> I usually can't be arsed to work out which struct is larger, unless I
> know it off the top of my head.
I need to read more carefully. I could change the location of the
nd_opt_source_link_addr declaration in rad/frontend.c's
build_packet() to follow style(9), but it's not quite clear where the
declaration would go. For example, the nd_opt_mtu and the
nd_opt_prefix_info are already in backwards order, relative to the
guideline in style(9), so I'd have to start re-ordering other
variables to be able to slot the nd_opt_source_link_addr into a size-
sorted location.
As it is, the existing variables (pre-patch) have a sane, how-are-
they-used order to them. If it's okay, I'm inclined to follow that
same logical style that's already being used in rad/frontend.c's
build_packet(), and leave that part of the proposed patch unchanged
from the previous version?
> > 2. I'm uncertain why we're doing a strcmp() against the name of the
> > interface while iterating over ifap, instead of comparing against
> > the interface index. But, since that's how it's done in
> > slaacd/frontend.c, where both the name and the index are also
> > available, I'm assuming there's a reason we use the name instead of
> > the index that I just don't know about. So, I switched to using a
> > strcmp() against the interface name instead of comparing against
> > the index.
>
> I was under the impression that an interface index could be
> recycled when you unplug a usb device and plug in a different one. But
> that does not seem to be true. if_index seems to be monotonically
> increasing. I think it's best to leave this for now as is and maybe
> change all "may" daemons to use if_index. This probably shows up in
> dhcpleased, slaacd and unwind as well.
Thanks for the explanation. I've left that the same as it was in the
previous version of the proposed patch.
> > + if (ra_iface_conf->ra_options.source_link_addr) {
> > + ndopt_source_link_addr = (struct nd_opt_source_link_addr *)p;
> > + ndopt_source_link_addr->nd_opt_source_link_addr_type =
> > + ND_OPT_SOURCE_LINKADDR;
> > + ndopt_source_link_addr->nd_opt_source_link_addr_len = 1;
> > + if (getifaddrs(&ifap) != 0) {
> > + ifap = NULL;
> > + log_warn("getifaddrs");
> > + }
> > + for (ifa = ifap; ifa; ifa = ifa->ifa_next) {
> > + [...snip...]
> > + }
> > + if (ifap != NULL)
> > + freeifaddrs(ifap);
> > + p += sizeof(*ndopt_source_link_addr);
>
> if you fail to find a mac address, for example because you are racing
> against an interface being unplugged, you will (try to) send a source
> link addr option with a mac address of all zero.
>
> I think the easiest way to avoid that is doing something like this:
>
> if (ifap != NULL) {
> freeifaddrs(ifap);
> p += sizeof(*ndopt_source_link_addr);
> } else
> len -= sizeof(*ndopt_source_link_addr);
Nice catch! I've updated the proposed patch below with this change
(that's the only change from the previous version).
diff --git a/frontend.c b/frontend.c
index 6748e43732d..5370cb3359b 100644
--- a/frontend.c
+++ b/frontend.c
@@ -56,6 +56,7 @@
#include <sys/uio.h>
#include <net/if.h>
+#include <net/if_dl.h>
#include <net/if_types.h>
#include <net/route.h>
@@ -120,6 +121,12 @@ struct nd_opt_pref64 {
u_int8_t nd_opt_pref64[12];
};
+struct nd_opt_source_link_addr {
+ u_int8_t nd_opt_source_link_addr_type;
+ u_int8_t nd_opt_source_link_addr_len;
+ struct ether_addr nd_opt_source_link_addr_hw_addr;
+};
+
TAILQ_HEAD(, ra_iface) ra_interfaces;
__dead void frontend_shutdown(void);
@@ -1099,6 +1106,7 @@ void
build_packet(struct ra_iface *ra_iface)
{
struct nd_router_advert *ra;
+ struct nd_opt_source_link_addr *ndopt_source_link_addr;
struct nd_opt_mtu *ndopt_mtu;
struct nd_opt_prefix_info *ndopt_pi;
struct ra_iface_conf *ra_iface_conf;
@@ -1110,6 +1118,8 @@ build_packet(struct ra_iface *ra_iface)
struct ra_rdnss_conf *ra_rdnss;
struct ra_dnssl_conf *ra_dnssl;
struct ra_pref64_conf *pref64;
+ struct ifaddrs *ifap, *ifa;
+ struct sockaddr_dl *sdl;
size_t len, label_len;
uint8_t *p, buf[RA_MAX_SIZE];
char *label_start, *label_end;
@@ -1119,6 +1129,8 @@ build_packet(struct ra_iface *ra_iface)
ra_options_conf = &ra_iface_conf->ra_options;
len = sizeof(*ra);
+ if (ra_iface_conf->ra_options.source_link_addr)
+ len += sizeof(*ndopt_source_link_addr);
if (ra_options_conf->mtu > 0)
len += sizeof(*ndopt_mtu);
len += sizeof(*ndopt_pi) * ra_iface->prefix_count;
@@ -1170,6 +1182,37 @@ build_packet(struct ra_iface *ra_iface)
ra->nd_ra_retransmit = htonl(ra_options_conf->retrans_timer);
p += sizeof(*ra);
+ if (ra_iface_conf->ra_options.source_link_addr) {
+ ndopt_source_link_addr = (struct nd_opt_source_link_addr *)p;
+ ndopt_source_link_addr->nd_opt_source_link_addr_type =
+ ND_OPT_SOURCE_LINKADDR;
+ ndopt_source_link_addr->nd_opt_source_link_addr_len = 1;
+ if (getifaddrs(&ifap) != 0) {
+ ifap = NULL;
+ log_warn("getifaddrs");
+ }
+ for (ifa = ifap; ifa; ifa = ifa->ifa_next) {
+ if (ifa->ifa_addr == NULL ||
+ ifa->ifa_addr->sa_family != AF_LINK)
+ continue;
+ if (strcmp(ra_iface->name, ifa->ifa_name) != 0)
+ continue;
+ sdl = (struct sockaddr_dl *)ifa->ifa_addr;
+ if (sdl->sdl_type != IFT_ETHER ||
+ sdl->sdl_alen != ETHER_ADDR_LEN)
+ continue;
+ memcpy(&ndopt_source_link_addr->
+ nd_opt_source_link_addr_hw_addr,
+ LLADDR(sdl), ETHER_ADDR_LEN);
+ break;
+ }
+ if (ifap != NULL) {
+ freeifaddrs(ifap);
+ p += sizeof(*ndopt_source_link_addr);
+ } else
+ len -= sizeof(*ndopt_source_link_addr);
+ }
+
if (ra_options_conf->mtu > 0) {
ndopt_mtu = (struct nd_opt_mtu *)p;
ndopt_mtu->nd_opt_mtu_type = ND_OPT_MTU;
diff --git a/parse.y b/parse.y
index 66cc0c0ab64..b9d369bde17 100644
--- a/parse.y
+++ b/parse.y
@@ -122,6 +122,7 @@ typedef struct {
%token CONFIGURATION OTHER LIFETIME REACHABLE TIME RETRANS TIMER
%token AUTO PREFIX VALID PREFERENCE PREFERRED LIFETIME ONLINK AUTONOMOUS
%token ADDRESS_CONFIGURATION DNS NAMESERVER SEARCH MTU NAT64 HIGH MEDIUM LOW
+%token SOURCE LINK_LAYER
%token <v.string> STRING
%token <v.number> NUMBER
@@ -226,6 +227,9 @@ ra_opt_block : DEFAULT ROUTER yesno {
| RETRANS TIMER NUMBER {
ra_options->retrans_timer = $3;
}
+ | SOURCE LINK_LAYER ADDRESS yesno {
+ ra_options->source_link_addr = $4;
+ }
| MTU NUMBER {
ra_options->mtu = $2;
}
@@ -523,6 +527,7 @@ lookup(char *s)
{"interface", RA_IFACE},
{"lifetime", LIFETIME},
{"limit", LIMIT},
+ {"link-layer", LINK_LAYER},
{"low", LOW},
{"managed", MANAGED},
{"medium", MEDIUM},
@@ -539,6 +544,7 @@ lookup(char *s)
{"retrans", RETRANS},
{"router", ROUTER},
{"search", SEARCH},
+ {"source", SOURCE},
{"time", TIME},
{"timer", TIMER},
{"valid", VALID},
diff --git a/printconf.c b/printconf.c
index 184a5df2dc1..2edfb06edd4 100644
--- a/printconf.c
+++ b/printconf.c
@@ -78,6 +78,8 @@ print_ra_options(const char *indent, const struct ra_options_conf *ra_options)
printf("%srouter lifetime %d\n", indent, ra_options->router_lifetime);
printf("%sreachable time %u\n", indent, ra_options->reachable_time);
printf("%sretrans timer %u\n", indent, ra_options->retrans_timer);
+ printf("%ssource link-layer address %s\n", indent,
+ yesno(ra_options->source_link_addr));
if (ra_options->mtu > 0)
printf("%smtu %u\n", indent, ra_options->mtu);
diff --git a/rad.c b/rad.c
index b283e58fbf1..9093932c563 100644
--- a/rad.c
+++ b/rad.c
@@ -757,6 +757,7 @@ config_new_empty(void)
xconf->ra_options.router_lifetime = ADV_DEFAULT_LIFETIME;
xconf->ra_options.reachable_time = 0;
xconf->ra_options.retrans_timer = 0;
+ xconf->ra_options.source_link_addr = 1;
xconf->ra_options.mtu = 0;
xconf->ra_options.rdns_lifetime = DEFAULT_RDNS_LIFETIME;
SIMPLEQ_INIT(&xconf->ra_options.ra_rdnss_list);
diff --git a/rad.conf.5 b/rad.conf.5
index f82ccfd216e..a2589cda88c 100644
--- a/rad.conf.5
+++ b/rad.conf.5
@@ -127,6 +127,10 @@ The default is medium.
.\" XXX
.\" .It Ic retrans timer Ar number
.\" XXX
+.It Ic source link-layer address Pq Ic yes Ns | Ns Ic no
+Add a source link-layer address option to router advertisement messages, to
+communicate the link-layer address of the sending interface.
+The default is yes.
.El
.Sh INTERFACES
A list of interfaces or interface groups to send advertisements on:
diff --git a/rad.h b/rad.h
index 787e78c7c4a..61187dba273 100644
--- a/rad.h
+++ b/rad.h
@@ -97,6 +97,7 @@ struct ra_options_conf {
int router_lifetime; /* default router lifetime */
uint32_t reachable_time;
uint32_t retrans_timer;
+ int source_link_addr; /* source link-layer address */
uint32_t mtu;
uint32_t rdns_lifetime;
SIMPLEQ_HEAD(, ra_rdnss_conf) ra_rdnss_list;
add source link-layer address option to rad(8)