From: David Gwynne
Subject: RFC 5517 PVLAN support for veb(4)
To: tech@openbsd.org
Date: Sat, 15 Nov 2025 13:26:49 +1000

this builds on the vlan awareness that i recently added to veb(4) to
implement Private VLANs as per RFC 5517.

what pvlan means is probably best explained in the rfc, so i won't try
and do it again here. instead i'll try and explain how it fits in
veb(4).

veb(4) (and bridge(4)) currently support isolation via "protected
domains", which can be configured to provide isolation between ports
in a bridge. protected domains have a few shortcomings compared to
pvlan.

firstly, pvlan is standardised and has been implemented in a bunch of
other products. this means there's a common understanding in the wider
community about how isolation works within an ethernet network, and
that's defined by pvlan. the protected domains semantics are different
to pvlan and seem unique to openbsd in my experience.

secondly, pvlan allows multiple devices to participate and enforce
isolation across a wider network, while protected domains only provide
isolation between ports within a single instance of veb(4) (or
bridge(4)). further to this, protected domains isolate ports before
vlan tag processing is done, while pvlan provides isolation within
vlans.

both of these together mean that veb can be used as part of a larger
pvlan environment. eg, we could run virtual machines in vmd on pvlan
isolated ports in a veb, and that isolation could be carried to
physical switches that can do pvlan and into another hypervisor that
also understands pvlan, or a primary port on a switch where routing
and firewalling happens for the whole environment.

protected domains do have at least one advantage over pvlan, which is
that they can be used to build full mesh l2 bridge topologies between
multiple devices by preventing loops of BUM traffic. so i see them as
complementary functionality, not competing.
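
for readers comparing the two mechanisms, the protected domain check
can be sketched roughly like this. it mirrors the
(rp->p_protected & tp->p_protected) test visible in the veb forwarding
code in the diff, but struct port and protected_permit() are made-up
names for illustration, not anything in the tree.

```c
#include <assert.h>
#include <stddef.h>

/* hypothetical stand-in for struct veb_port; only the fields used here */
struct port {
	const char	*name;
	unsigned int	 protected;	/* bitmask of protected domains */
};

/*
 * returns 1 if a frame received on rp may be sent out of tp,
 * 0 if it has to be dropped.
 */
static int
protected_permit(const struct port *rp, const struct port *tp)
{
	assert(rp != NULL && tp != NULL);

	if (rp == tp)
		return (0);	/* don't let frames hairpin */

	if (rp->protected & tp->protected)
		return (0);	/* the ports share a protected domain */

	return (1);
}
```

a port with no domains set is never isolated by this check, and the
check doesn't look at vlan tags at all, which is the gap pvlan fills.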

the implementation extends the core etherbridge functionality so it
stores the extra information needed to be usable by a pvlan aware
bridge. in practice this means address entries in etherbridges store a
secondary vlan id. a primary vlan in a pvlan topology functions the
same as a normal vlan, so in the normal case the diff treats every
vlan as the primary vlan of a pvlan. this means all the other users of
the etherbridge code can just ignore the extra vlan info.

because normal vlans look the same as primary pvlans, the veb
forwarding code just does pvlan all the time, but tries to optimise
for the primary vlan case. the changes to the forwarding path are
surprisingly small; the bulk of the code ended up being around the
ioctls.

pvlans are configured on the veb interface, so all ports participating
in the bridge are subject to that config.

to configure pvlan on vlan 900 with 901 used for isolated ports:

        # ifconfig veb0 pvlan 900
        # ifconfig veb0 pvlan-isolated 900 901
        # ifconfig veb0
        veb0: flags=8843
                index 7 llprio 3
                encap: vnetid none txprio packet rxprio outer
                groups: veb
                pvlan 900 isolated 901
                vmx0 flags=3
                        port 1 ifpriority 0 ifcost 0 -untagged
                        tagged: 780-781,871
                vport0 flags=2000
                        port 8 ifpriority 0 ifcost 0 untagged 871
                etherip0 flags=4003
                        port 10 ifpriority 0 ifcost 0 -untagged
                        tagged: 900-901
                vport900 flags=3
                        port 11 ifpriority 0 ifcost 0 untagged 900
                vport901 flags=3
                        port 12 ifpriority 0 ifcost 0 untagged 901
                vport911 flags=3
                        port 13 ifpriority 0 ifcost 0 untagged 901

in this setup all the vport interfaces are in separate rdomains.
vport900 can talk to the ips on vport901 and vport911 and vice versa,
but vport901 and vport911 are isolated from each other, as expected.

the veb and vport topology is replicated on another host and
connected ("trunked") by the etherip interface.
vport900 on this host, as a port on the primary pvlan, can talk to
vport900, vport901 and vport911 on the other host, but vport901 and
vport911 on this host can't talk to vport901 and vport911 on the other
host because they're in the same isolated pvlan.

community ports have been tested and function as described in the rfc
too.

the rfc assumes that all the vlan aware devices in a pvlan network are
pvlan aware. in practice this is not true. there are a lot of routers
and firewalls that understand vlans because it's handy to be able to
route between vlans without having to have a physical interface per
vlan. every pvlan capable switch i've used has a way to deal with
this, which is basically to mark a trunk to such a device as only
working with primary pvlan tags. this code also provides that with the
"pvptags" option you can set on veb ports. "ifconfig veb0 pvptags
etherip0" sets it, "ifconfig veb0 -pvptags etherip0" clears it.

having this functionality in veb also makes it easy to build test
environments or labs for pvlan. you can build topologies out of a veb
with multiple vports in different rdomains or tap interfaces plugged
into virtual machines (or any ethernet interface) and see what does
and doesn't go where with tcpdump. i couldn't find this anywhere else.
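
a minimal standalone sketch of the pvlan forwarding rule described
above, loosely adapted from veb_pvlan() and veb_pvlan_filter() in the
diff below. the PVLAN_* names, pvlan_lookup() and pvlan_filter() are
illustrative stand-ins, and the sketch only models the relationship
between vids inside one pvlan. it ignores ports, tagging and the
pvptags handling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* these mirror the VEB_PVLAN_* defines in the diff */
#define PVLAN_V_MASK		0x0fff
#define PVLAN_T_PRIMARY		(0 << 12)
#define PVLAN_T_ISOLATED	(1 << 12)
#define PVLAN_T_COMMUNITY	(2 << 12)
#define PVLAN_T_MASK		(3 << 12)

#define VID_COUNT		4096	/* assumed size of the table */

/*
 * the table is indexed by secondary vid and each entry holds
 * (type | primary vid). a NULL table or a 0 entry means the vid has
 * no pvlan config, so it is treated as its own primary.
 */
static uint16_t
pvlan_lookup(const uint16_t *table, uint16_t vid)
{
	assert(vid < VID_COUNT);

	if (table == NULL || table[vid] == 0)
		return (PVLAN_T_PRIMARY | vid);

	return (table[vid]);
}

/*
 * returns 1 if a frame that arrived on vid svs must be dropped rather
 * than delivered to something on vid dvs, and 0 if it is permitted.
 */
static int
pvlan_filter(const uint16_t *table, uint16_t svs, uint16_t dvs)
{
	uint16_t pvlan = pvlan_lookup(table, svs);
	uint16_t vp = pvlan & PVLAN_V_MASK;

	switch (pvlan & PVLAN_T_MASK) {
	case PVLAN_T_PRIMARY:
		/* primary ports are permitted to send to anything */
		return (0);

	case PVLAN_T_COMMUNITY:
		/* members of the same community can talk */
		if (svs == dvs)
			return (0);
		/* FALLTHROUGH */
	case PVLAN_T_ISOLATED:
		/* otherwise only the primary vlan is reachable */
		return (vp != dvs);
	}

	return (0);
}
```

with 900 as the primary, 901 isolated and 902 as a community (so
table[901] = PVLAN_T_ISOLATED | 900 and so on), pvlan_filter() permits
900 to 901 and 901 to 900, but drops 901 to 901 and 902 to 901.
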
Index: sbin/ifconfig/brconfig.c =================================================================== RCS file: /cvs/src/sbin/ifconfig/brconfig.c,v diff -u -p -r1.41 brconfig.c --- sbin/ifconfig/brconfig.c 4 Nov 2025 02:00:26 -0000 1.41 +++ sbin/ifconfig/brconfig.c 15 Nov 2025 00:27:42 -0000 @@ -61,10 +61,11 @@ void bridge_badrule(int, char **, int); void bridge_showrule(struct ifbrlreq *); int bridge_arprule(struct ifbrlreq *, int *, char ***); void bridge_vidmap(const char *); +void bridge_pvlans(const char *); #define IFBAFBITS "\020\1STATIC" #define IFBIFBITS \ -"\020\1LEARNING\2DISCOVER\3BLOCKNONIP\4STP\5EDGE\6AUTOEDGE\7PTP\10AUTOPTP\11SPAN\15LOCAL\16LOCKED" +"\020\1LEARNING\2DISCOVER\3BLOCKNONIP\4STP\5EDGE\6AUTOEDGE\7PTP\10AUTOPTP\11SPAN\15LOCAL\16LOCKED\17PVPTAGS" #define PV2ID(pv, epri, eaddr) do { \ epri = pv >> 48; \ @@ -147,6 +148,18 @@ unsetlocked(const char *val, int d) } void +setpvptags(const char *val, int d) +{ + bridge_ifsetflag(val, IFBIF_PVLAN_PTAGS); +} + +void +unsetpvptags(const char *val, int d) +{ + bridge_ifclrflag(val, IFBIF_PVLAN_PTAGS); +} + +void setstp(const char *val, int d) { bridge_ifsetflag(val, IFBIF_STP); @@ -698,7 +711,94 @@ bridge_unset_vidmap(const char *ifsname, if (ioctl(sock, SIOCBRDGSVMAP, &ifbrvm) == -1) err(1, "%s -tagged %s", ifname, ifsname); +} + +static void +bridge_pvlan_primary_op(const char *primary, const char *op, long cmd) +{ + struct ifbrpvlan ifbrpv; + uint16_t vid; + const char *errstr; + + vid = strtonum(primary, EVL_VLID_MIN, EVL_VLID_MAX, &errstr); + if (errstr != NULL) + errx(1, "primary vid: %s", errstr); + + memset(&ifbrpv, 0, sizeof(ifbrpv)); + + strlcpy(ifbrpv.ifbrpv_name, ifname, sizeof(ifbrpv.ifbrpv_name)); + ifbrpv.ifbrpv_primary = vid; + ifbrpv.ifbrpv_type = IFBRPV_T_PRIMARY; + + if (ioctl(sock, cmd, &ifbrpv) == -1) + err(1, "%s %s %s", ifname, op, primary); +} + +void +bridge_pvlan_primary(const char *primary, int d) +{ + bridge_pvlan_primary_op(primary, "pvlan", SIOCBRDGADDPV); +} + +void 
+bridge_unpvlan_primary(const char *primary, int d) +{ + bridge_pvlan_primary_op(primary, "-pvlan", SIOCBRDGDELPV); +} + +static void +bridge_pvlan_secondary_op(const char *primary, const char *secondary, + int type, const char *op, long cmd) +{ + struct ifbrpvlan ifbrpv; + uint16_t vp, vs; + const char *errstr; + + vp = strtonum(primary, EVL_VLID_MIN, EVL_VLID_MAX, &errstr); + if (errstr != NULL) + errx(1, "primary vid: %s", errstr); + + vs = strtonum(secondary, EVL_VLID_MIN, EVL_VLID_MAX, &errstr); + if (errstr != NULL) + errx(1, "secondary vid: %s", errstr); + + memset(&ifbrpv, 0, sizeof(ifbrpv)); + + strlcpy(ifbrpv.ifbrpv_name, ifname, sizeof(ifbrpv.ifbrpv_name)); + ifbrpv.ifbrpv_primary = vp; + ifbrpv.ifbrpv_secondary = vs; + ifbrpv.ifbrpv_type = type; + if (ioctl(sock, cmd, &ifbrpv) == -1) + err(1, "%s %s %s %s", ifname, op, primary, secondary); +} + +void +bridge_pvlan_isolated(const char *primary, const char *secondary) +{ + bridge_pvlan_secondary_op(primary, secondary, IFBRPV_T_ISOLATED, + "pvlan-isolated", SIOCBRDGADDPV); +} + +void +bridge_unpvlan_isolated(const char *primary, const char *secondary) +{ + bridge_pvlan_secondary_op(primary, secondary, IFBRPV_T_ISOLATED, + "-pvlan-isolated", SIOCBRDGDELPV); +} + +void +bridge_pvlan_community(const char *primary, const char *secondary) +{ + bridge_pvlan_secondary_op(primary, secondary, IFBRPV_T_COMMUNITY, + "pvlan-community", SIOCBRDGADDPV); +} + +void +bridge_unpvlan_community(const char *primary, const char *secondary) +{ + bridge_pvlan_secondary_op(primary, secondary, IFBRPV_T_COMMUNITY, + "-pvlan-community", SIOCBRDGDELPV); } void @@ -1168,6 +1268,7 @@ bridge_status(void) return; bridge_cfg("\t"); + bridge_pvlans("\t"); bridge_list("\t"); if (aflag && !ifaliases) @@ -1238,6 +1339,66 @@ bridge_vidmap(const char *ifsname) if (sep == ' ') printf(" none"); printf("\n"); +} + +void +bridge_pvlans(const char *delim) +{ + uint16_t vp = 0, vs; + struct ifbrpvlan ifbrpv; + + for (;;) { + const char *sep = " 
community "; + memset(&ifbrpv, 0, sizeof(ifbrpv)); + + strlcpy(ifbrpv.ifbrpv_name, ifname, sizeof(ifbrpv.ifbrpv_name)); + ifbrpv.ifbrpv_primary = ++vp; /* lucky vids start at 1 */ + ifbrpv.ifbrpv_type = IFBRPV_T_PRIMARY; + + if (ioctl(sock, SIOCBRDGNFINDPV, &ifbrpv) == -1) { + if (errno == ENOENT) { + /* all done */ + return; + } + + warn("%s SIOCBRDGNFINDPV %u", ifname, vp); + return; + } + + printf("%spvlan %u isolated", delim, ifbrpv.ifbrpv_primary); + if (ifbrpv.ifbrpv_secondary != EVL_VLID_NULL) + printf(" %u", ifbrpv.ifbrpv_secondary); + else + printf(" none"); + + vp = ifbrpv.ifbrpv_primary; + vs = 0; + + for (;;) { + strlcpy(ifbrpv.ifbrpv_name, ifname, + sizeof(ifbrpv.ifbrpv_name)); + ifbrpv.ifbrpv_primary = vp; + ifbrpv.ifbrpv_secondary = ++vs; + ifbrpv.ifbrpv_type = IFBRPV_T_COMMUNITY; + + if (ioctl(sock, SIOCBRDGNFINDPV, &ifbrpv) == -1) { + if (errno == ENOENT) { + /* all done */ + break; + } + + warn("%s SIOCBRDGNFINDPV %u %u", ifname, + vp, vs); + break; + } + + printf("%s%u", sep, ifbrpv.ifbrpv_secondary); + vs = ifbrpv.ifbrpv_secondary; + sep = ","; + } + + printf("\n"); + } } void Index: sbin/ifconfig/ifconfig.c =================================================================== RCS file: /cvs/src/sbin/ifconfig/ifconfig.c,v diff -u -p -r1.478 ifconfig.c --- sbin/ifconfig/ifconfig.c 1 Nov 2025 10:14:21 -0000 1.478 +++ sbin/ifconfig/ifconfig.c 15 Nov 2025 00:27:42 -0000 @@ -564,6 +564,8 @@ const struct cmd { { "-learn", NEXTARG, 0, unsetlearn }, { "locked", NEXTARG, 0, setlocked }, { "-locked", NEXTARG, 0, unsetlocked }, + { "pvptags", NEXTARG, 0, setpvptags }, + { "-pvptags", NEXTARG, 0, unsetpvptags }, { "stp", NEXTARG, 0, setstp }, { "-stp", NEXTARG, 0, unsetstp }, { "edge", NEXTARG, 0, setedge }, @@ -576,6 +578,16 @@ const struct cmd { { "-untagged", NEXTARG, 0, bridge_unpvid }, { "tagged", NEXTARG2, 0, NULL, bridge_set_vidmap }, { "-tagged", NEXTARG2, 0, bridge_unset_vidmap }, + { "pvlan", NEXTARG, 0, bridge_pvlan_primary }, + { "-pvlan", 
NEXTARG, 0, bridge_unpvlan_primary }, + { "pvlan-isolated", + NEXTARG2, 0, NULL, bridge_pvlan_isolated }, + { "-pvlan-isolated", + NEXTARG2, 0, NULL, bridge_unpvlan_isolated }, + { "pvlan-community", + NEXTARG2, 0, NULL, bridge_pvlan_community }, + { "-pvlan-community", + NEXTARG2, 0, NULL, bridge_unpvlan_community }, { "ptp", NEXTARG, 0, setptp }, { "-ptp", NEXTARG, 0, unsetptp }, { "autoptp", NEXTARG, 0, setautoptp }, Index: sbin/ifconfig/ifconfig.h =================================================================== RCS file: /cvs/src/sbin/ifconfig/ifconfig.h,v diff -u -p -r1.8 ifconfig.h --- sbin/ifconfig/ifconfig.h 1 Nov 2025 10:14:21 -0000 1.8 +++ sbin/ifconfig/ifconfig.h 15 Nov 2025 00:27:42 -0000 @@ -31,6 +31,8 @@ void setlearn(const char *, int); void unsetlearn(const char *, int); void setlocked(const char *, int); void unsetlocked(const char *, int); +void setpvptags(const char *, int); +void unsetpvptags(const char *, int); void setstp(const char *, int); void unsetstp(const char *, int); void setedge(const char *, int); @@ -65,6 +67,12 @@ void bridge_pvid(const char *, const cha void bridge_unpvid(const char *, int); void bridge_set_vidmap(const char *, const char *); void bridge_unset_vidmap(const char *, int); +void bridge_pvlan_primary(const char *, int); +void bridge_unpvlan_primary(const char *, int); +void bridge_pvlan_isolated(const char *, const char *); +void bridge_unpvlan_isolated(const char *, const char *); +void bridge_pvlan_community(const char *, const char *); +void bridge_unpvlan_community(const char *, const char *); void bridge_proto(const char *, int); void bridge_ifprio(const char *, const char *); void bridge_ifcost(const char *, const char *); Index: sys/net/if.c =================================================================== RCS file: /cvs/src/sys/net/if.c,v diff -u -p -r1.749 if.c --- sys/net/if.c 13 Nov 2025 23:30:01 -0000 1.749 +++ sys/net/if.c 15 Nov 2025 00:27:42 -0000 @@ -2473,6 +2473,8 @@ forceup: case SIOCBRDGSTXHC: 
case SIOCBRDGSPROTO: case SIOCBRDGSVMAP: + case SIOCBRDGADDPV: + case SIOCBRDGDELPV: #endif if ((error = suser(p)) != 0) break; Index: sys/net/if_bpe.c =================================================================== RCS file: /cvs/src/sys/net/if_bpe.c,v diff -u -p -r1.26 if_bpe.c --- sys/net/if_bpe.c 1 Nov 2025 10:04:49 -0000 1.26 +++ sys/net/if_bpe.c 15 Nov 2025 00:27:42 -0000 @@ -710,7 +710,7 @@ bpe_add_addr(struct bpe_softc *sc, const /* check endpoint for multicast or broadcast? */ return (etherbridge_add_addr(&sc->sc_eb, (void *)endpoint, - 0, &ifba->ifba_dst, type)); + 0, 0, &ifba->ifba_dst, type)); } static int @@ -770,7 +770,7 @@ bpe_input(struct ifnet *ifp0, struct mbu ceh = (struct ether_header *)(itagp + 1); etherbridge_map_ea(&sc->sc_eb, ceh->ether_shost, - 0, (struct ether_addr *)beh->ether_shost); + 0, 0, (struct ether_addr *)beh->ether_shost); m_adj(m, sizeof(*beh) + sizeof(*itagp)); Index: sys/net/if_bridge.h =================================================================== RCS file: /cvs/src/sys/net/if_bridge.h,v diff -u -p -r1.76 if_bridge.h --- sys/net/if_bridge.h 2 Nov 2025 00:15:20 -0000 1.76 +++ sys/net/if_bridge.h 15 Nov 2025 00:27:42 -0000 @@ -82,6 +82,7 @@ struct ifbreq { #define IFBIF_SPAN 0x0100 /* ifs is a span port (ro) */ #define IFBIF_LOCAL 0x1000 /* local port in switch(4) */ #define IFBIF_LOCKED 0x2000 /* restrict rx src mac with fib */ +#define IFBIF_PVLAN_PTAGS 0x4000 /* only use tags for primary pvlans */ #define IFBIF_RO_MASK 0x0f00 /* read only bits */ /* SIOCBRDGFLUSH */ @@ -269,6 +270,18 @@ struct ifbrvidmap { #define IFBRVM_OP_ANDNOT 0x2 /* kernel &= ~ifbrvm_map */ unsigned int ifbrvm_gen; uint8_t ifbrvm_map[512]; +}; + +struct ifbrpvlan { + char ifbrpv_name[IFNAMSIZ]; + uint16_t ifbrpv_primary; + uint16_t ifbrpv_secondary; + unsigned int ifbrpv_type; +#define IFBRPV_T_PRIMARY 0 +#define IFBRPV_T_SECONDARY 1 /* for searching */ +#define IFBRPV_T_ISOLATED 2 +#define IFBRPV_T_COMMUNITY 3 + unsigned int ifbrpv_gen; }; 
#ifdef _KERNEL Index: sys/net/if_etherbridge.c =================================================================== RCS file: /cvs/src/sys/net/if_etherbridge.c,v diff -u -p -r1.9 if_etherbridge.c --- sys/net/if_etherbridge.c 1 Nov 2025 10:04:49 -0000 1.9 +++ sys/net/if_etherbridge.c 15 Nov 2025 00:27:42 -0000 @@ -239,21 +239,33 @@ ebe_free(void *arg) } void * -etherbridge_resolve_ea(struct etherbridge *eb, uint16_t vid, +etherbridge_resolve_ea(struct etherbridge *eb, uint16_t pv, const struct ether_addr *ea) { - return (etherbridge_resolve(eb, vid, ether_addr_to_e64(ea))); + return (etherbridge_resolve(eb, pv, ether_addr_to_e64(ea))); } void * -etherbridge_resolve(struct etherbridge *eb, uint16_t vid, uint64_t eba) +etherbridge_resolve(struct etherbridge *eb, uint16_t vp, uint64_t eba) +{ + struct eb_entry *ebe; + + ebe = etherbridge_resolve_entry(eb, vp, eba); + if (ebe == NULL) + return (NULL); + + return (ebe->ebe_port); +} + +struct eb_entry * +etherbridge_resolve_entry(struct etherbridge *eb, uint16_t vp, uint64_t eba) { struct eb_list *ebl; struct eb_entry *ebe; SMR_ASSERT_CRITICAL(); - eba |= (uint64_t)vid << 48; + eba |= (uint64_t)vp << 48; ebl = etherbridge_list(eb, eba); ebe = ebl_find(ebl, eba); if (ebe != NULL) { @@ -263,21 +275,22 @@ etherbridge_resolve(struct etherbridge * return (NULL); } - return (ebe->ebe_port); + return (ebe); } return (NULL); } void -etherbridge_map_ea(struct etherbridge *eb, void *port, uint16_t vid, - const struct ether_addr *ea) +etherbridge_map_ea(struct etherbridge *eb, void *port, + uint16_t vp, uint16_t vs, const struct ether_addr *ea) { - etherbridge_map(eb, port, vid, ether_addr_to_e64(ea)); + etherbridge_map(eb, port, vp, vs, ether_addr_to_e64(ea)); } void -etherbridge_map(struct etherbridge *eb, void *port, uint16_t vid, uint64_t eba) +etherbridge_map(struct etherbridge *eb, void *port, + uint16_t vp, uint16_t vs, uint64_t eba) { struct eb_list *ebl; struct eb_entry *oebe, *nebe; @@ -291,7 +304,7 @@ 
etherbridge_map(struct etherbridge *eb, now = getuptime(); - eba |= (uint64_t)vid << 48; + eba |= (uint64_t)vp << 48; ebl = etherbridge_list(eb, eba); smr_read_enter(); @@ -335,6 +348,7 @@ etherbridge_map(struct etherbridge *eb, nebe->ebe_addr = eba; nebe->ebe_port = nport; + nebe->ebe_vs = vs; nebe->ebe_type = EBE_DYNAMIC; nebe->ebe_created = now; nebe->ebe_age = now; @@ -390,9 +404,9 @@ etherbridge_map(struct etherbridge *eb, int etherbridge_add_addr(struct etherbridge *eb, void *port, - uint16_t vid, const struct ether_addr *ea, unsigned int type) + uint16_t vp, uint16_t vs, const struct ether_addr *ea, unsigned int type) { - uint64_t eba = ether_addr_to_e64(ea) | (uint64_t)vid << 48; + uint64_t eba = ether_addr_to_e64(ea) | (uint64_t)vp << 48; struct eb_list *ebl; struct eb_entry *nebe; unsigned int num; @@ -420,6 +434,7 @@ etherbridge_add_addr(struct etherbridge nebe->ebe_addr = eba; nebe->ebe_port = nport; + nebe->ebe_vs = vs; nebe->ebe_type = type; nebe->ebe_created = now; nebe->ebe_age = now; @@ -450,10 +465,10 @@ etherbridge_add_addr(struct etherbridge return (error); } int -etherbridge_del_addr(struct etherbridge *eb, uint16_t vid, +etherbridge_del_addr(struct etherbridge *eb, uint16_t vp, const struct ether_addr *ea) { - uint64_t eba = ether_addr_to_e64(ea) | (uint64_t)vid << 48; + uint64_t eba = ether_addr_to_e64(ea) | (uint64_t)vp << 48; struct eb_list *ebl; struct eb_entry *oebe; const struct eb_entry key = { @@ -528,7 +543,9 @@ etherbridge_age(void *arg) } void -etherbridge_detach_port(struct etherbridge *eb, void *port) +etherbridge_filter(struct etherbridge *eb, + int (*filter)(struct etherbridge *, struct eb_entry *, void *), + void *cookie) { struct eb_entry *ebe, *nebe; struct eb_queue ebq = TAILQ_HEAD_INITIALIZER(ebq); @@ -539,7 +556,7 @@ etherbridge_detach_port(struct etherbrid mtx_enter(&eb->eb_lock); /* don't block map too much */ SMR_TAILQ_FOREACH_SAFE_LOCKED(ebe, ebl, ebe_lentry, nebe) { - if (!eb_port_eq(eb, ebe->ebe_port, port)) + if 
(!filter(eb, ebe, cookie)) continue; ebl_remove(ebl, ebe); @@ -568,46 +585,38 @@ etherbridge_detach_port(struct etherbrid } } -void -etherbridge_flush(struct etherbridge *eb, uint32_t flags) +static int +etherbridge_detach_port_filter(struct etherbridge *eb, struct eb_entry *ebe, + void *port) { - struct eb_entry *ebe, *nebe; - struct eb_queue ebq = TAILQ_HEAD_INITIALIZER(ebq); - size_t i; - - for (i = 0; i < ETHERBRIDGE_TABLE_SIZE; i++) { - struct eb_list *ebl = &eb->eb_table[i]; - - mtx_enter(&eb->eb_lock); /* don't block map too much */ - SMR_TAILQ_FOREACH_SAFE_LOCKED(ebe, ebl, ebe_lentry, nebe) { - if (flags == IFBF_FLUSHDYN && - ebe->ebe_type != EBE_DYNAMIC) - continue; + return (eb_port_eq(eb, ebe->ebe_port, port)); +} - ebl_remove(ebl, ebe); - ebt_remove(eb, ebe); - eb->eb_num--; +void +etherbridge_detach_port(struct etherbridge *eb, void *port) +{ + etherbridge_filter(eb, etherbridge_detach_port_filter, port); +} - /* we own the tables ref now */ +static int +etherbridge_flush_filter(struct etherbridge *eb, struct eb_entry *ebe, + void *cookie) +{ + uint32_t flags = (uintptr_t)cookie; - TAILQ_INSERT_TAIL(&ebq, ebe, ebe_qentry); - } - mtx_leave(&eb->eb_lock); - } + if (flags == IFBF_FLUSHALL) + return (1); - if (TAILQ_EMPTY(&ebq)) - return; + /* must be IFBF_FLUSHDYN */ + return (ebe->ebe_type == EBE_DYNAMIC); +} - /* - * do one smr barrier for all the entries rather than an - * smr_call each. 
- */ - smr_barrier(); +void +etherbridge_flush(struct etherbridge *eb, uint32_t flags) +{ + void *cookie = (void *)(uintptr_t)flags; - TAILQ_FOREACH_SAFE(ebe, &ebq, ebe_qentry, nebe) { - TAILQ_REMOVE(&ebq, ebe, ebe_qentry); - ebe_free(ebe); - } + etherbridge_filter(eb, etherbridge_flush_filter, cookie); } int @@ -705,7 +714,8 @@ etherbridge_vareq(struct etherbridge *eb ebe->ebe_port); bvareq.ifbva_created = ebe->ebe_created; bvareq.ifbva_used = ebe->ebe_age; - bvareq.ifbva_vid = ebe->ebe_addr >> 48; + /* report the secondary pvlan vid */ + bvareq.ifbva_vid = ebe->ebe_vs; ether_e64_to_addr(&bvareq.ifbva_dst, ebe->ebe_addr); memset(&bvareq.ifbva_dstsa, 0, sizeof(bvareq.ifbva_dstsa)); Index: sys/net/if_etherbridge.h =================================================================== RCS file: /cvs/src/sys/net/if_etherbridge.h,v diff -u -p -r1.6 if_etherbridge.h --- sys/net/if_etherbridge.h 1 Nov 2025 10:04:49 -0000 1.6 +++ sys/net/if_etherbridge.h 15 Nov 2025 00:27:42 -0000 @@ -44,6 +44,7 @@ struct eb_entry { uint64_t ebe_addr; void *ebe_port; + uint16_t ebe_vs; /* secondary vid */ unsigned int ebe_type; #define EBE_DYNAMIC 0x0 #define EBE_STATIC 0x1 @@ -55,6 +56,9 @@ struct eb_entry { struct smr_entry ebe_smr_entry; }; +#define etherbridge_port(_ebe) ((_ebe)->ebe_port) +#define etherbridge_vs(_ebe) ((_ebe)->ebe_vs) + SMR_TAILQ_HEAD(eb_list, eb_entry); RBT_HEAD(eb_tree, eb_entry); TAILQ_HEAD(eb_queue, eb_entry); @@ -81,13 +85,19 @@ int etherbridge_up(struct etherbridge * int etherbridge_down(struct etherbridge *); void etherbridge_destroy(struct etherbridge *); -void etherbridge_map(struct etherbridge *, void *, uint16_t, uint64_t); +void etherbridge_map(struct etherbridge *, void *, + uint16_t, uint16_t, uint64_t); void etherbridge_map_ea(struct etherbridge *, void *, - uint16_t, const struct ether_addr *); + uint16_t, uint16_t, const struct ether_addr *); +struct eb_entry * + etherbridge_resolve_entry(struct etherbridge *, + uint16_t, uint64_t); void 
*etherbridge_resolve(struct etherbridge *, uint16_t, uint64_t); void *etherbridge_resolve_ea(struct etherbridge *, uint16_t, const struct ether_addr *); void etherbridge_detach_port(struct etherbridge *, void *); +void etherbridge_filter(struct etherbridge *, + int (*)(struct etherbridge *, struct eb_entry *, void *), void *); /* ioctl support */ int etherbridge_set_max(struct etherbridge *, struct ifbrparam *); @@ -97,7 +107,7 @@ int etherbridge_get_tmo(struct etherbri int etherbridge_rtfind(struct etherbridge *, struct ifbaconf *); int etherbridge_vareq(struct etherbridge *, struct ifbaconf *); int etherbridge_add_addr(struct etherbridge *, void *, - uint16_t, const struct ether_addr *, unsigned int); + uint16_t, uint16_t, const struct ether_addr *, unsigned int); int etherbridge_del_addr(struct etherbridge *, uint16_t, const struct ether_addr *); void etherbridge_flush(struct etherbridge *, uint32_t); Index: sys/net/if_gre.c =================================================================== RCS file: /cvs/src/sys/net/if_gre.c,v diff -u -p -r1.191 if_gre.c --- sys/net/if_gre.c 1 Nov 2025 10:04:49 -0000 1.191 +++ sys/net/if_gre.c 15 Nov 2025 00:27:42 -0000 @@ -1460,7 +1460,7 @@ nvgre_input(const struct gre_tunnel *key eh = mtod(m, struct ether_header *); etherbridge_map_ea(&sc->sc_eb, (void *)&key->t_dst, - 0, (struct ether_addr *)eh->ether_shost); + 0, 0, (struct ether_addr *)eh->ether_shost); SET(m->m_pkthdr.csum_flags, M_FLOWID); m->m_pkthdr.ph_flowid = bemtoh32(&key->t_key) & ~GRE_KEY_ENTROPY; @@ -3711,7 +3711,7 @@ nvgre_add_addr(struct nvgre_softc *sc, c } return (etherbridge_add_addr(&sc->sc_eb, &endpoint, - 0, &ifba->ifba_dst, type)); + 0, 0, &ifba->ifba_dst, type)); } static int Index: sys/net/if_veb.c =================================================================== RCS file: /cvs/src/sys/net/if_veb.c,v diff -u -p -r1.52 if_veb.c --- sys/net/if_veb.c 7 Nov 2025 09:57:29 -0000 1.52 +++ sys/net/if_veb.c 15 Nov 2025 00:27:42 -0000 @@ -62,7 +62,7 @@ /* 
SIOCBRDGIFFLGS, SIOCBRDGIFFLGS */ #define VEB_IFBIF_FLAGS \ - (IFBIF_LOCKED|IFBIF_LEARNING|IFBIF_DISCOVER|IFBIF_BLOCKNONIP) + (IFBIF_PVLAN_PTAGS|IFBIF_LOCKED|IFBIF_LEARNING|IFBIF_DISCOVER|IFBIF_BLOCKNONIP) struct veb_rule { TAILQ_ENTRY(veb_rule) vr_entry; @@ -139,6 +139,17 @@ struct veb_ports { /* followed by an array of veb_port pointers */ }; +struct veb_pvlan { + RBT_ENTRY(veb_pvlan) v_entry; + uint16_t v_primary; + uint16_t v_secondary; +#define v_isolated v_secondary + unsigned int v_type; +}; + +RBT_HEAD(veb_pvlan_vp, veb_pvlan); +RBT_HEAD(veb_pvlan_vs, veb_pvlan); + struct veb_softc { struct ifnet sc_if; unsigned int sc_dead; @@ -151,6 +162,51 @@ struct veb_softc { struct rwlock sc_rule_lock; struct veb_ports *sc_ports; struct veb_ports *sc_spans; + + /* + * pvlan topology is stored twice: + * + * once in an array hanging off sc_pvlans for the forwarding path. + * entries in sc_pvlans are indexed by the secondary vid (Vs), and + * stores the primary vid (Vp) the Vs is associated with and the + * type of relationship Vs has with Vp. + * + * primary vids have an entry filled with their own vid to indicate + * that the vid is in use. + * + * vids without pvlan configuration have 0 in their sc_pvlans entry. + */ + uint16_t *sc_pvlans; +#define VEB_PVLAN_V_MASK EVL_VLID_MASK +#define VEB_PVLAN_T_PRIMARY (0 << 12) +#define VEB_PVLAN_T_ISOLATED (1 << 12) +#define VEB_PVLAN_T_COMMUNITY (2 << 12) +#define VEB_PVLAN_T_MASK (3 << 12) + + /* + * the pvlan topology is stored again in trees for the + * ioctls. technically the ioctl code could brute force through + * the sc_pvlans above, but this seemed like a good idea at + * the time. + * + * primary vids are stored in their own sc_pvlans_vp tree. + * there can only be one isolaved vid (Vi) per pvlan, which + * is managed using the v_isolated (v_secondary) id member + * in the primary veb_vplan struct here. + * + * secondary vids are stored in the sc_pvlans_vs tree. 
+ * they're ordered by Vp, type, and Vs to make it easy to + * find pvlans for userland. + */ + struct veb_pvlan_vp sc_pvlans_vp; + struct veb_pvlan_vs sc_pvlans_vs; + + /* + * this is incremented when the pvlan topology changes, and + * copied into the FINDPV and NFINDPV ioctl results so userland + * can tell if a change has happened across multiple queries. + */ + unsigned int sc_pvlans_gen; }; #define DPRINTF(_sc, fmt...) do { \ @@ -216,6 +272,11 @@ static int veb_del_vid_addr(struct veb_s static int veb_get_vid_map(struct veb_softc *, struct ifbrvidmap *); static int veb_set_vid_map(struct veb_softc *, const struct ifbrvidmap *); +static int veb_add_pvlan(struct veb_softc *, const struct ifbrpvlan *); +static int veb_del_pvlan(struct veb_softc *, const struct ifbrpvlan *); +static int veb_find_pvlan(struct veb_softc *, struct ifbrpvlan *); +static int veb_nfind_pvlan(struct veb_softc *, struct ifbrpvlan *); + static int veb_rule_add(struct veb_softc *, const struct ifbrlreq *); static int veb_rule_list_flush(struct veb_softc *, const struct ifbrlreq *); @@ -239,6 +300,42 @@ static const struct etherbridge_ops veb_ veb_eb_port_sa, }; +static inline int +veb_pvlan_vp_cmp(const struct veb_pvlan *a, const struct veb_pvlan *b) +{ + if (a->v_primary < b->v_primary) + return (-1); + if (a->v_primary > b->v_primary) + return (1); + return (0); +} + +RBT_PROTOTYPE(veb_pvlan_vp, veb_pvlan, v_entry, veb_pvlan_vp_cmp); + +static inline int +veb_pvlan_vs_cmp(const struct veb_pvlan *a, const struct veb_pvlan *b) +{ + int rv; + + rv = veb_pvlan_vp_cmp(a, b); + if (rv != 0) + return (rv); + + if (a->v_type < b->v_type) + return (-1); + if (a->v_type > b->v_type) + return (1); + + if (a->v_secondary < b->v_secondary) + return (-1); + if (a->v_secondary > b->v_secondary) + return (1); + + return (0); +} + +RBT_PROTOTYPE(veb_pvlan_vs, veb_pvlan, v_entry, veb_pvlan_vs_cmp); + static struct if_clone veb_cloner = IF_CLONE_INITIALIZER("veb", veb_clone_create, veb_clone_destroy); @@ 
-291,6 +388,8 @@ veb_clone_create(struct if_clone *ifc, i
 	rw_init(&sc->sc_rule_lock, "vebrlk");
 	sc->sc_ports = NULL;
 	sc->sc_spans = NULL;
+	RBT_INIT(veb_pvlan_vp, &sc->sc_pvlans_vp);
+	RBT_INIT(veb_pvlan_vs, &sc->sc_pvlans_vs);

 	ifp = &sc->sc_if;

@@ -338,6 +437,7 @@ veb_clone_destroy(struct ifnet *ifp)
 	struct veb_ports *mp, *ms;
 	struct veb_port **ps;
 	struct veb_port *p;
+	struct veb_pvlan *v, *nv;
 	unsigned int i;

 	NET_LOCK();
@@ -373,8 +473,8 @@ veb_clone_destroy(struct ifnet *ifp)
 			veb_p_unlink(sc, ps[i]);
 	}

+	smr_barrier(); /* everything everywhere all at once */
 	if (mp != NULL || ms != NULL) {
-		smr_barrier(); /* everything everywhere all at once */

 		if (mp != NULL) {
 			refcnt_finalize(&mp->m_refs, "vebdtor");
@@ -409,6 +509,16 @@ veb_clone_destroy(struct ifnet *ifp)

 	etherbridge_destroy(&sc->sc_eb);

+	RBT_FOREACH_SAFE(v, veb_pvlan_vp, &sc->sc_pvlans_vp, nv) {
+		RBT_REMOVE(veb_pvlan_vp, &sc->sc_pvlans_vp, v);
+		free(v, M_IFADDR, sizeof(*v));
+	}
+	RBT_FOREACH_SAFE(v, veb_pvlan_vs, &sc->sc_pvlans_vs, nv) {
+		RBT_REMOVE(veb_pvlan_vs, &sc->sc_pvlans_vs, v);
+		free(v, M_IFADDR, sizeof(*v));
+	}
+	free(sc->sc_pvlans, M_IFADDR, VEB_VID_COUNT * sizeof(*sc->sc_pvlans));
+
 	free(sc, M_DEVBUF, sizeof(*sc));

 	return (0);
@@ -701,19 +811,52 @@ veb_pf(struct ifnet *ifp0, int dir, stru
 }
 #endif /* NPF > 0 */

+struct veb_ctx {
+	struct netstack *ns;
+	struct veb_port *p;
+	uint64_t src;
+	uint64_t dst;
+	uint16_t vp; /* primary vlan */
+	uint16_t vs; /* secondary vlan */
+	uint16_t vt; /* secondary vlan type */
+};
+
+static int
+veb_pvlan_filter(const struct veb_ctx *ctx, uint16_t vs)
+{
+	switch (ctx->vt) {
+	case VEB_PVLAN_T_PRIMARY:
+		/* primary ports are permitted to send to anything */
+		break;
+
+	case VEB_PVLAN_T_COMMUNITY:
+		/* same communities are permitted */
+		if (ctx->vs == vs)
+			break;
+
+		/* FALLTHROUGH */
+	case VEB_PVLAN_T_ISOLATED:
+		/* isolated (or community) can only send to a primary port */
+		if (ctx->vp == vs)
+			break;
+
+		return (1);
+	}
+
+	return (0);
+}
+
 static void
-veb_broadcast(struct veb_softc *sc, struct veb_port *rp, struct mbuf *m0,
-    uint64_t src, uint64_t dst, uint16_t vid, struct netstack *ns)
+veb_broadcast(struct veb_softc *sc, struct veb_ctx *ctx, struct mbuf *m0)
 {
 	struct ifnet *ifp = &sc->sc_if;
 	struct veb_ports *pm;
 	struct veb_port **ps;
-	struct veb_port *tp;
 	struct ifnet *ifp0;
 	struct mbuf *m;
 	unsigned int i;

-	if (rp->p_pvid == vid) { /* XXX which vlan is the right one? */
+	if (ctx->p->p_pvid == ctx->vs) { /* XXX which vlan is the right one? */
 #if NPF > 0
 		/*
 		 * we couldn't find a specific port to send this packet to,
@@ -721,7 +864,7 @@ veb_broadcast(struct veb_softc *sc, stru
 		 * let pf look at it, but use the veb interface as a proxy.
 		 */
 		if (ISSET(ifp->if_flags, IFF_LINK1) &&
-		    (m0 = veb_pf(ifp, PF_FWD, m0, ns)) == NULL)
+		    (m0 = veb_pf(ifp, PF_FWD, m0, ctx->ns)) == NULL)
 			return;
 #endif
 	}
@@ -739,9 +882,11 @@ veb_broadcast(struct veb_softc *sc, stru

 	ps = veb_ports_array(pm);
 	for (i = 0; i < pm->m_count; i++) {
-		tp = ps[i];
+		struct veb_port *tp = ps[i];
+		uint16_t pvid, vid;
+		unsigned int bif_flags;

-		if (rp == tp || (rp->p_protected & tp->p_protected)) {
+		if (ctx->p == tp || (ctx->p->p_protected & tp->p_protected)) {
 			/*
 			 * don't let Ethernet packets hairpin or
 			 * move between ports in the same protected
@@ -750,24 +895,45 @@ veb_broadcast(struct veb_softc *sc, stru
 			continue;
 		}

-		if (vid != tp->p_pvid) {
-			if (veb_vid_map_filter(tp, vid))
-				continue;
-		}
-
 		ifp0 = tp->p_ifp0;
 		if (!ISSET(ifp0->if_flags, IFF_RUNNING)) {
 			/* don't waste time */
 			continue;
 		}

-		if (!ISSET(tp->p_bif_flags, IFBIF_DISCOVER) &&
+		bif_flags = READ_ONCE(tp->p_bif_flags);
+
+		if (!ISSET(bif_flags, IFBIF_DISCOVER) &&
 		    !ISSET(m0->m_flags, M_BCAST | M_MCAST)) {
 			/* don't flood unknown unicast */
 			continue;
 		}

-		if (veb_rule_filter(tp, VEB_RULE_LIST_OUT, m0, src, dst, vid))
+		pvid = tp->p_pvid;
+		if (pvid < IFBR_PVID_MIN || pvid > IFBR_PVID_MAX ||
+		    veb_pvlan_filter(ctx, pvid)) {
+			if (ISSET(bif_flags, IFBIF_PVLAN_PTAGS)) {
+				/*
+				 * port is attached to something that is
+				 * vlan aware but pvlan unaware. only flood
+				 * to the primary vid.
+				 */
+				vid = ctx->vp;
+			} else {
+				/*
+				 * this must be an inter switch
+				 * trunk, so use the original vid.
+				 */
+				vid = ctx->vs;
+			}
+
+			if (veb_vid_map_filter(tp, vid))
+				continue;
+		} else
+			vid = pvid;
+
+		if (veb_rule_filter(tp, VEB_RULE_LIST_OUT, m0,
+		    ctx->src, ctx->dst, vid))
 			continue;

 		m = m_dup_pkt(m0, max_linkhdr + ETHER_ALIGN, M_NOWAIT);
@@ -776,7 +942,9 @@ veb_broadcast(struct veb_softc *sc, stru
 			continue;
 		}

-		if (vid == tp->p_pvid)
+		if (pvid != vid)
+			m->m_pkthdr.ether_vtag |= vid;
+		else
 			CLR(m->m_flags, M_VLANTAG);

 		m = ether_offload_ifcap(ifp0, m);
@@ -794,17 +962,17 @@ done:
 }

 static struct mbuf *
-veb_transmit(struct veb_softc *sc, struct veb_port *rp, struct veb_port *tp,
-    struct mbuf *m, uint64_t src, uint64_t dst, uint16_t vid,
-    struct netstack *ns)
+veb_transmit(struct veb_softc *sc, struct veb_ctx *ctx, struct mbuf *m,
+    struct veb_port *tp, uint16_t tvs)
 {
 	struct ifnet *ifp = &sc->sc_if;
 	struct ifnet *ifp0;
+	uint16_t pvid, vid = tvs;

 	if (tp == NULL)
 		return (m);

-	if (rp == tp || (rp->p_protected & tp->p_protected)) {
+	if (ctx->p == tp || (ctx->p->p_protected & tp->p_protected)) {
 		/*
 		 * don't let Ethernet packets hairpin or move between
 		 * ports in the same protected domain(s).
@@ -812,21 +980,43 @@ veb_transmit(struct veb_softc *sc, struc
 		goto drop;
 	}

-	/* pvid or tagged config can override address entries */
-	if (vid != tp->p_pvid) {
+	if (veb_pvlan_filter(ctx, tvs))
+		goto drop;
+
+	/* address entries are still subject to tagged config */
+	pvid = tp->p_pvid;
+	if (tvs != pvid) {
+		if (ISSET(tp->p_bif_flags, IFBIF_PVLAN_PTAGS)) {
+			/*
+			 * this port is vlan aware but pvlan unaware,
+			 * so it only understands the primary vlan.
+			 */
+			if (tvs != ctx->vp)
+				goto drop;
+		} else {
+			/*
+			 * this must be an inter switch trunk, so use the
+			 * original vid.
+			 */
+			vid = ctx->vs;
+		}
+
 		if (veb_vid_map_filter(tp, vid))
 			goto drop;
 	}

-	if (veb_rule_filter(tp, VEB_RULE_LIST_OUT, m, src, dst, vid))
+	if (veb_rule_filter(tp, VEB_RULE_LIST_OUT, m,
+	    ctx->src, ctx->dst, vid))
 		goto drop;

 	ifp0 = tp->p_ifp0;

-	if (vid == tp->p_pvid) {
+	if (tvs != pvid)
+		m->m_pkthdr.ether_vtag |= vid;
+	else {
 #if NPF > 0
 		if (ISSET(ifp->if_flags, IFF_LINK1) &&
-		    (m = veb_pf(ifp0, PF_FWD, m, ns)) == NULL)
+		    (m = veb_pf(ifp0, PF_FWD, m, ctx->ns)) == NULL)
 			return (NULL);
 #endif

@@ -857,6 +1047,28 @@ veb_vport_input(struct ifnet *ifp0, stru

 	return (m);
 }

+static uint16_t
+veb_pvlan(struct veb_softc *sc, uint16_t vid)
+{
+	uint16_t *pvlans;
+	uint16_t pvlan;
+
+	/*
+	 * a normal non-pvlan vlan operates like the primary vid in a pvlan,
+	 * or vice versa. when doing a lookup we pretend that a non-pvlan vid
+	 * is the primary vid in a pvlan.
+	 */
+
+	pvlans = SMR_PTR_GET(&sc->sc_pvlans);
+	if (pvlans == NULL)
+		return (VEB_PVLAN_T_PRIMARY | vid);
+
+	pvlan = pvlans[vid];
+	if (pvlan == 0)
+		return (VEB_PVLAN_T_PRIMARY | vid);
+
+	return (pvlan);
+}
 static struct mbuf *
 veb_port_input(struct ifnet *ifp0, struct mbuf *m, uint64_t dst, void *brport,
@@ -864,10 +1076,16 @@ veb_port_input(struct ifnet *ifp0, struc
 {
 	struct veb_port *p = brport;
 	struct veb_softc *sc = p->p_veb;
+	struct veb_ctx ctx = {
+		.ns = ns,
+		.p = p,
+		.dst = dst,
+		.vs = p->p_pvid,
+	};
 	struct ifnet *ifp = &sc->sc_if;
 	struct ether_header *eh;
-	uint64_t src;
-	uint16_t vid = p->p_pvid;
+	unsigned int bif_flags;
+	uint16_t pvlan;
 	int prio;
 #if NBPFILTER > 0
 	caddr_t if_bpf;
@@ -924,12 +1142,13 @@ veb_port_input(struct ifnet *ifp0, struc
 		uint16_t tvid = EVL_VLANOFTAG(m->m_pkthdr.ether_vtag);

 		if (tvid == EVL_VLID_NULL) {
+			/* this preserves PRIOFTAG for BPF */
 			CLR(m->m_flags, M_VLANTAG);
 		} else if (veb_vid_map_filter(p, tvid)) {
 			/* count vlan tagged drop */
 			goto drop;
 		} else
-			vid = tvid;
+			ctx.vs = tvid;

 		prio = sc->sc_rxprio;
 		switch (prio) {
@@ -945,27 +1164,47 @@ veb_port_input(struct ifnet *ifp0, struc
 			m->m_pkthdr.pf.prio = prio;
 			break;
 		}
-	} else if (vid == IFBR_PVID_DECLINE)
-		return (m);
+	} else {
+		/* prepare for BPF */
+		m->m_pkthdr.ether_vtag = 0;
+	}

-	if (vid == IFBR_PVID_NONE)
+	if (ctx.vs == IFBR_PVID_DECLINE)
+		return (m);
+	if (ctx.vs == IFBR_PVID_NONE)
 		goto drop;

 #ifdef DIAGNOSTIC
-	if (vid < IFBR_PVID_MIN ||
-	    vid > IFBR_PVID_MAX) {
-		panic("%s: %s vid %u is outside valid range", __func__,
-		    ifp0->if_xname, vid);
+	if (ctx.vs < IFBR_PVID_MIN ||
+	    ctx.vs > IFBR_PVID_MAX) {
+		panic("%s: %s vid %u is outside valid range", __func__,
+		    ifp0->if_xname, ctx.vs);
 	}
 #endif

-	src = ether_addr_to_e64((struct ether_addr *)eh->ether_shost);
+	smr_read_enter();
+	pvlan = veb_pvlan(sc, ctx.vs);
+	smr_read_leave();
+
+	ctx.vp = pvlan & VEB_PVLAN_V_MASK;
+	ctx.vt = pvlan & VEB_PVLAN_T_MASK;
+	ctx.src = ether_addr_to_e64((struct ether_addr *)eh->ether_shost);
+
+	bif_flags = READ_ONCE(p->p_bif_flags);
+
+	if (ISSET(bif_flags, IFBIF_PVLAN_PTAGS) &&
+	    ISSET(m->m_flags, M_VLANTAG) &&
+	    ctx.vt != VEB_PVLAN_T_PRIMARY)
+		goto drop;

-	if (ISSET(p->p_bif_flags, IFBIF_LOCKED)) {
-		struct veb_port *rp;
+	if (ISSET(bif_flags, IFBIF_LOCKED)) {
+		struct eb_entry *ebe;
+		struct veb_port *rp = NULL;

 		smr_read_enter();
-		rp = etherbridge_resolve(&sc->sc_eb, vid, src);
+		ebe = etherbridge_resolve_entry(&sc->sc_eb, ctx.vp, ctx.src);
+		if (ebe != NULL && ctx.vs == etherbridge_vs(ebe))
+			rp = etherbridge_port(ebe);
 		smr_read_leave();

 		if (rp != p)
@@ -975,14 +1214,10 @@ veb_port_input(struct ifnet *ifp0, struc
 	counters_pkt(ifp->if_counters, ifc_ipackets, ifc_ibytes,
 	    m->m_pkthdr.len);

-	/*
-	 * set things up so we show BPF on veb which vlan this
-	 * packet is on. i can't decide if the txprio or rxprio is
-	 * better here, so i went with the third option of doing
-	 * nothing. - dlg
-	 */
-	SET(m->m_flags, M_VLANTAG);
-	m->m_pkthdr.ether_vtag = vid;
+	if (!ISSET(m->m_flags, M_VLANTAG)) {
+		SET(m->m_flags, M_VLANTAG); /* for BPF */
+		m->m_pkthdr.ether_vtag |= ctx.vs;
+	}

 	/* force packets into the one routing domain for pf */
 	m->m_pkthdr.ph_rtableid = ifp->if_rdomain;
@@ -997,7 +1232,7 @@ veb_port_input(struct ifnet *ifp0, struc

 	veb_span(sc, m);

-	if (ISSET(p->p_bif_flags, IFBIF_BLOCKNONIP) &&
+	if (ISSET(bif_flags, IFBIF_BLOCKNONIP) &&
 	    veb_ip_filter(m))
 		goto drop;

@@ -1005,39 +1240,43 @@ veb_port_input(struct ifnet *ifp0, struc
 	    veb_svlan_filter(m))
 		goto drop;

-	if (veb_rule_filter(p, VEB_RULE_LIST_IN, m, src, dst, vid))
+	if (veb_rule_filter(p, VEB_RULE_LIST_IN, m, ctx.src, ctx.dst, ctx.vs))
 		goto drop;

 #if NPF > 0
 	if (ISSET(ifp->if_flags, IFF_LINK1) &&
-	    (m = veb_pf(ifp0, PF_IN, m, ns)) == NULL)
+	    (m = veb_pf(ifp0, PF_IN, m, ctx.ns)) == NULL)
 		return (NULL);
 #endif

 	eh = mtod(m, struct ether_header *);

-	if (ISSET(p->p_bif_flags, IFBIF_LEARNING))
-		etherbridge_map(&sc->sc_eb, p, vid, src);
+	if (ISSET(bif_flags, IFBIF_LEARNING))
+		etherbridge_map(&sc->sc_eb, ctx.p, ctx.vp, ctx.vs, ctx.src);

 	prio = sc->sc_txprio;
 	prio = (prio == IF_HDRPRIO_PACKET) ? m->m_pkthdr.pf.prio : prio;
 	/* IEEE 802.1p has prio 0 and 1 swapped */
 	if (prio <= 1)
 		prio = !prio;
-	m->m_pkthdr.ether_vtag |= (prio << EVL_PRIO_BITS);
+	m->m_pkthdr.ether_vtag = (prio << EVL_PRIO_BITS);

 	CLR(m->m_flags, M_BCAST|M_MCAST);

-	if (!ETH64_IS_MULTICAST(dst)) {
+	if (!ETH64_IS_MULTICAST(ctx.dst)) {
+		struct eb_entry *ebe;
 		struct veb_port *tp = NULL;
+		uint16_t tvs = 0;

 		smr_read_enter();
-		tp = etherbridge_resolve(&sc->sc_eb, vid, dst);
-		if (tp != NULL)
-			veb_eb_port_take(NULL, tp);
+		ebe = etherbridge_resolve_entry(&sc->sc_eb, ctx.vp, ctx.dst);
+		if (ebe != NULL) {
+			tp = veb_eb_port_take(NULL, etherbridge_port(ebe));
+			tvs = etherbridge_vs(ebe);
+		}
 		smr_read_leave();
 		if (tp != NULL) {
-			m = veb_transmit(sc, p, tp, m, src, dst, vid, ns);
+			m = veb_transmit(sc, &ctx, m, tp, tvs);
 			veb_eb_port_rele(NULL, tp);
 		}

@@ -1046,10 +1285,11 @@ veb_port_input(struct ifnet *ifp0, struc

 		/* unknown unicast address */
 	} else {
-		SET(m->m_flags, ETH64_IS_BROADCAST(dst) ? M_BCAST : M_MCAST);
+		SET(m->m_flags,
+		    ETH64_IS_BROADCAST(ctx.dst) ? M_BCAST : M_MCAST);
 	}

-	veb_broadcast(sc, p, m, src, dst, vid, ns);
+	veb_broadcast(sc, &ctx, m);
 	return (NULL);

 drop:
@@ -1146,6 +1386,19 @@ veb_ioctl(struct ifnet *ifp, u_long cmd,
 		ifr->ifr_hdrprio = sc->sc_rxprio;
 		break;

+	case SIOCBRDGADDPV:
+		error = veb_add_pvlan(sc, (const struct ifbrpvlan *)data);
+		break;
+	case SIOCBRDGDELPV:
+		error = veb_del_pvlan(sc, (const struct ifbrpvlan *)data);
+		break;
+	case SIOCBRDGFINDPV:
+		error = veb_find_pvlan(sc, (struct ifbrpvlan *)data);
+		break;
+	case SIOCBRDGNFINDPV:
+		error = veb_nfind_pvlan(sc, (struct ifbrpvlan *)data);
+		break;
+
 	case SIOCBRDGADD:
 		error = suser(curproc);
 		if (error != 0)
@@ -1662,7 +1915,7 @@ veb_chk_vid_map(const struct ifbrvidmap
 	bit = 4095 % 8;
 	if (ISSET(ifbrvm->ifbrvm_map[off], 1U << bit))
 		return (EINVAL);
-	
+
 	return (0);
 }

@@ -1715,7 +1968,7 @@ veb_destroy_vid_map(uint32_t *map)
 {
 	struct veb_vid_map_dtor *dtor;

-	dtor = malloc(sizeof(*dtor), M_TEMP, M_NOWAIT);
+	dtor = malloc(sizeof(*dtor), M_TEMP, M_NOWAIT);
 	if (dtor == NULL) {
 		/* oh well, the proc can sleep instead */
 		smr_barrier();
@@ -1825,6 +2078,353 @@ put:
 }

 static int
+veb_vid_inuse(struct veb_softc *sc, uint16_t vid)
+{
+	struct veb_ports *pm;
+	struct veb_port **ps;
+	unsigned int off = vid / 32;
+	unsigned int bit = vid % 32;
+	unsigned int i;
+
+	/* must be holding sc->sc_rule_lock */
+
+	pm = SMR_PTR_GET_LOCKED(&sc->sc_ports);
+	ps = veb_ports_array(pm);
+	for (i = 0; i < pm->m_count; i++) {
+		struct veb_port *p = ps[i];
+		uint32_t *map;
+
+		if (p->p_pvid == vid)
+			return (1);
+
+		map = SMR_PTR_GET_LOCKED(&p->p_vid_map);
+		if (map != NULL && ISSET(map[off], 1U << bit))
+			return (1);
+	}
+
+	return (0);
+}
+
+static int
+veb_add_pvlan(struct veb_softc *sc, const struct ifbrpvlan *ifbrpv)
+{
+	struct veb_pvlan *v;
+	uint16_t *pvlans = NULL;
+	int error;
+
+	if (ifbrpv->ifbrpv_primary < EVL_VLID_MIN ||
+	    ifbrpv->ifbrpv_primary > EVL_VLID_MAX)
+		return (EINVAL);
+
+	switch (ifbrpv->ifbrpv_type) {
+	case IFBRPV_T_PRIMARY:
+		if (ifbrpv->ifbrpv_secondary != 0)
+			return (EINVAL);
+		break;
+	case IFBRPV_T_ISOLATED:
+	case IFBRPV_T_COMMUNITY:
+		if (ifbrpv->ifbrpv_secondary < EVL_VLID_MIN ||
+		    ifbrpv->ifbrpv_secondary > EVL_VLID_MAX)
+			return (EINVAL);
+		break;
+	default:
+		return (EINVAL);
+	}
+
+	if (sc->sc_pvlans == NULL) {
+		pvlans = mallocarray(VEB_VID_COUNT, sizeof(*pvlans),
+		    M_IFADDR, M_WAITOK|M_CANFAIL|M_ZERO);
+		if (pvlans == NULL)
+			return (ENOMEM);
+	}
+
+	v = malloc(sizeof(*v), M_IFADDR, M_WAITOK|M_CANFAIL);
+	if (v == NULL) {
+		error = ENOMEM;
+		goto freepvlans;
+	}
+
+	v->v_primary = ifbrpv->ifbrpv_primary;
+	v->v_secondary = ifbrpv->ifbrpv_secondary;
+	v->v_type = ifbrpv->ifbrpv_type;
+
+	error = rw_enter(&sc->sc_rule_lock, RW_WRITE|RW_INTR);
+	if (error != 0)
+		goto free;
+
+	if (sc->sc_pvlans == NULL) {
+		KASSERT(pvlans != NULL);
+		SMR_PTR_SET_LOCKED(&sc->sc_pvlans, pvlans);
+		pvlans = NULL;
+	}
+
+	if (ifbrpv->ifbrpv_type == IFBRPV_T_PRIMARY) {
+		struct veb_pvlan *ovp;
+
+		if (sc->sc_pvlans[v->v_primary] != 0) {
+			error = EBUSY;
+			goto err;
+		}
+
+		ovp = RBT_INSERT(veb_pvlan_vp, &sc->sc_pvlans_vp, v);
+		if (ovp != NULL) {
+			panic("%s: %s %p pvlans and pvlans_vp inconsistency\n",
+			    __func__, sc->sc_if.if_xname, sc);
+		}
+
+		sc->sc_pvlans[v->v_primary] = VEB_PVLAN_T_PRIMARY |
+		    v->v_primary;
+	} else { /* secondary */
+		struct veb_pvlan *vp, *ovs;
+		uint16_t pve = v->v_primary;
+
+		if (sc->sc_pvlans[v->v_secondary] != 0) {
+			error = EBUSY;
+			goto err;
+		}
+
+		if (sc->sc_pvlans[v->v_primary] != v->v_primary) {
+			error = ENETUNREACH; /* XXX */
+			goto err;
+		}
+
+		vp = RBT_FIND(veb_pvlan_vp, &sc->sc_pvlans_vp, v);
+		if (vp == NULL) {
+			panic("%s: %s %p pvlans and pvlans_vp inconsistency\n",
+			    __func__, sc->sc_if.if_xname, sc);
+		}
+
+		if (veb_vid_inuse(sc, v->v_secondary)) {
+			error = EADDRINUSE;
+			goto err;
+		}
+
+		if (ifbrpv->ifbrpv_type == IFBRPV_T_ISOLATED) {
+			if (vp->v_isolated != 0) {
+				error = EADDRNOTAVAIL;
+				goto err;
+			}
+			vp->v_isolated = v->v_secondary;
+			pve |= VEB_PVLAN_T_ISOLATED;
+		} else { /* IFBRPV_T_COMMUNITY */
+			pve |= VEB_PVLAN_T_COMMUNITY;
+		}
+
+		ovs = RBT_INSERT(veb_pvlan_vs, &sc->sc_pvlans_vs, v);
+		if (ovs != NULL) {
+			panic("%s: %s %p pvlans and pvlans_vs inconsistency\n",
+			    __func__, sc->sc_if.if_xname, sc);
+		}
+
+		sc->sc_pvlans[v->v_secondary] = pve;
+	}
+	sc->sc_pvlans_gen++;
+	v = NULL;
+
+err:
+	rw_exit(&sc->sc_rule_lock);
+free:
+	free(v, M_IFADDR, sizeof(*v));
+freepvlans:
+	free(pvlans, M_IFADDR, VEB_VID_COUNT * sizeof(*pvlans));
+	return (error);
+}
+
+static int
+veb_dev_pvlan_filter(struct etherbridge *eb, struct eb_entry *ebe,
+    void *cookie)
+{
+	struct veb_pvlan *vs = cookie;
+
+	return (etherbridge_vs(ebe) == vs->v_secondary);
+}
+
+static int
+veb_del_pvlan(struct veb_softc *sc, const struct ifbrpvlan *ifbrpv)
+{
+	struct veb_pvlan key;
+	struct veb_pvlan *v = NULL;
+	struct veb_pvlan *vp, *vs;
+	uint16_t *pvlans;
+	uint16_t pve;
+	int error;
+
+	if (ifbrpv->ifbrpv_primary < EVL_VLID_MIN ||
+	    ifbrpv->ifbrpv_primary > EVL_VLID_MAX)
+		return (EINVAL);
+
+	switch (ifbrpv->ifbrpv_type) {
+	case IFBRPV_T_PRIMARY:
+		if (ifbrpv->ifbrpv_secondary != 0)
+			return (EINVAL);
+		break;
+	case IFBRPV_T_ISOLATED:
+	case IFBRPV_T_COMMUNITY:
+		if (ifbrpv->ifbrpv_secondary < EVL_VLID_MIN ||
+		    ifbrpv->ifbrpv_secondary > EVL_VLID_MAX)
+			return (EINVAL);
+		break;
+	default:
+		return (EINVAL);
+	}
+
+	key.v_primary = ifbrpv->ifbrpv_primary;
+	key.v_secondary = ifbrpv->ifbrpv_secondary;
+	key.v_type = ifbrpv->ifbrpv_type;
+
+	error = rw_enter(&sc->sc_rule_lock, RW_WRITE|RW_INTR);
+	if (error != 0)
+		return (error);
+
+	pvlans = sc->sc_pvlans;
+	if (pvlans == NULL) {
+		error = ESRCH;
+		goto err;
+	}
+
+	vp = RBT_FIND(veb_pvlan_vp, &sc->sc_pvlans_vp, &key);
+	if (vp == NULL) {
+		error = ESRCH;
+		goto err;
+	}
+
+	if (ifbrpv->ifbrpv_type == IFBRPV_T_PRIMARY) {
+		vs = RBT_NFIND(veb_pvlan_vs, &sc->sc_pvlans_vs, &key);
+		if (vs != NULL && vs->v_primary == vp->v_primary) {
+			error = EBUSY;
+			goto err;
+		}
+
+		v = vp;
+		KASSERT(v->v_isolated == 0); /* vs NFIND should have found this */
+
+		pve = VEB_PVLAN_T_PRIMARY | v->v_primary;
+		if (sc->sc_pvlans[v->v_primary] != pve) {
+			panic("%s: %s %p pvlans and pvlans_vp inconsistency\n",
+			    __func__, sc->sc_if.if_xname, sc);
+		}
+
+		RBT_REMOVE(veb_pvlan_vp, &sc->sc_pvlans_vp, v);
+		sc->sc_pvlans[v->v_primary] = 0;
+	} else { /* secondary */
+		uint16_t pve;
+
+		vs = RBT_FIND(veb_pvlan_vs, &sc->sc_pvlans_vs, &key);
+		if (vs == NULL || vs->v_type != key.v_type) {
+			error = ESRCH;
+			goto err;
+		}
+
+		if (veb_vid_inuse(sc, vs->v_secondary)) {
+			error = EBUSY;
+			goto err;
+		}
+
+		v = vs;
+		pve = v->v_primary;
+		if (ifbrpv->ifbrpv_type == IFBRPV_T_ISOLATED) {
+			KASSERT(vp->v_isolated == v->v_secondary);
+			vp->v_isolated = 0;
+
+			pve |= VEB_PVLAN_T_ISOLATED;
+		} else { /* community */
+			pve |= VEB_PVLAN_T_COMMUNITY;
+		}
+
+		if (sc->sc_pvlans[v->v_secondary] != pve) {
+			panic("%s: %s %p pvlans and pvlans_vs inconsistency\n",
+			    __func__, sc->sc_if.if_xname, sc);
+		}
+
+		RBT_REMOVE(veb_pvlan_vs, &sc->sc_pvlans_vs, v);
+		sc->sc_pvlans[v->v_secondary] = 0;
+		/* XXX smr_barrier for sc_pvlans entry use to end? */
+		etherbridge_filter(&sc->sc_eb, veb_dev_pvlan_filter, v);
+	}
+	sc->sc_pvlans_gen++;
+
+err:
+	rw_exit(&sc->sc_rule_lock);
+	free(v, M_IFADDR, sizeof(*v));
+	return (error);
+}
+
+static int
+veb_find_pvlan(struct veb_softc *sc, struct ifbrpvlan *ifbrpv)
+{
+	return (ENOTTY);
+}
+
+static int
+veb_nfind_pvlan_primary(struct veb_softc *sc, struct ifbrpvlan *ifbrpv)
+{
+	struct veb_pvlan key;
+	struct veb_pvlan *vp;
+	int error;
+
+	if (ifbrpv->ifbrpv_secondary != 0)
+		return (EINVAL);
+
+	key.v_primary = ifbrpv->ifbrpv_primary;
+
+	error = rw_enter(&sc->sc_rule_lock, RW_READ|RW_INTR);
+	if (error != 0)
+		return (error);
+
+	vp = RBT_NFIND(veb_pvlan_vp, &sc->sc_pvlans_vp, &key);
+	if (vp == NULL) {
+		error = ENOENT;
+		goto err;
+	}
+
+	ifbrpv->ifbrpv_primary = vp->v_primary;
+	ifbrpv->ifbrpv_secondary = vp->v_isolated;
+	ifbrpv->ifbrpv_gen = sc->sc_pvlans_gen;
+
+err:
+	rw_exit(&sc->sc_rule_lock);
+	return (error);
+}
+
+static int
+veb_nfind_pvlan(struct veb_softc *sc, struct ifbrpvlan *ifbrpv)
+{
+	struct veb_pvlan key;
+	struct veb_pvlan *vs;
+	int error;
+
+	if (ifbrpv->ifbrpv_type == IFBRPV_T_PRIMARY)
+		return (veb_nfind_pvlan_primary(sc, ifbrpv));
+
+	if (ifbrpv->ifbrpv_primary < EVL_VLID_MIN ||
+	    ifbrpv->ifbrpv_primary > EVL_VLID_MAX)
+		return (EINVAL);
+
+	key.v_primary = ifbrpv->ifbrpv_primary;
+	key.v_secondary = ifbrpv->ifbrpv_secondary;
+	key.v_type = ifbrpv->ifbrpv_type;
+
+	error = rw_enter(&sc->sc_rule_lock, RW_READ|RW_INTR);
+	if (error != 0)
+		return (error);
+
+	vs = RBT_NFIND(veb_pvlan_vs, &sc->sc_pvlans_vs, &key);
+	if (vs == NULL ||
+	    vs->v_primary != ifbrpv->ifbrpv_primary ||
+	    vs->v_type != ifbrpv->ifbrpv_type) {
+		error = ENOENT;
+		goto err;
+	}
+
+	ifbrpv->ifbrpv_secondary = vs->v_secondary;
+	ifbrpv->ifbrpv_gen = sc->sc_pvlans_gen;
+
+err:
+	rw_exit(&sc->sc_rule_lock);
+	return (error);
+}
+
+static int
 veb_rule_add(struct veb_softc *sc, const struct ifbrlreq *ifbr)
 {
 	const struct ifbrarpf *brla = &ifbr->ifbr_arpf;
@@ -2231,7 +2831,7 @@ veb_add_addr(struct veb_softc *sc, const
 	struct veb_port *p;
 	int error = 0;
 	unsigned int type;
-	uint16_t pvid;
+	uint16_t vp, vs;

 	if (ISSET(ifba->ifba_flags, ~IFBAF_TYPEMASK))
 		return (EINVAL);
@@ -2253,16 +2853,21 @@ veb_add_addr(struct veb_softc *sc, const
 	if (p == NULL)
 		return (ESRCH);

-	pvid = p->p_pvid;
-	if (pvid < IFBR_PVID_MIN ||
-	    pvid > IFBR_PVID_MAX) {
+	vs = p->p_pvid;
+	if (vs < IFBR_PVID_MIN ||
+	    vs > IFBR_PVID_MAX) {
 		error = EADDRNOTAVAIL;
 		goto put;
 	}

-	error = etherbridge_add_addr(&sc->sc_eb, p,
-	    pvid, &ifba->ifba_dst, type);
+	smr_read_enter();
+	vp = veb_pvlan(sc, vs);
+	smr_read_leave();
+	vp &= VEB_PVLAN_V_MASK;
+
+	error = etherbridge_add_addr(&sc->sc_eb, p,
+	    vp, vs, &ifba->ifba_dst, type);

 put:
 	veb_port_put(sc, p);

@@ -2275,7 +2880,7 @@ veb_add_vid_addr(struct veb_softc *sc, c
 	struct veb_port *p;
 	int error = 0;
 	unsigned int type;
-	uint16_t vid;
+	uint16_t vp, vs;

 	if (ISSET(ifbva->ifbva_flags, ~IFBAF_TYPEMASK))
 		return (EINVAL);
@@ -2303,18 +2908,24 @@ veb_add_vid_addr(struct veb_softc *sc, c
 	if (p == NULL)
 		return (ESRCH);

-	vid = ifbva->ifbva_vid;
-	if (vid == EVL_VLID_NULL) {
-		vid = p->p_pvid;
-		if (vid < IFBR_PVID_MIN ||
-		    vid > IFBR_PVID_MAX) {
+	vs = ifbva->ifbva_vid;
+	if (vs == EVL_VLID_NULL) {
+		vs = p->p_pvid;
+		if (vs < IFBR_PVID_MIN ||
+		    vs > IFBR_PVID_MAX) {
 			error = EADDRNOTAVAIL;
 			goto put;
 		}
 	}

+	smr_read_enter();
+	vp = veb_pvlan(sc, vs);
+	smr_read_leave();
+
+	vp &= VEB_PVLAN_V_MASK;
+
 	error = etherbridge_add_addr(&sc->sc_eb, p,
-	    vid, &ifbva->ifbva_dst, type);
+	    vp, vs, &ifbva->ifbva_dst, type);

 put:
 	veb_port_put(sc, p);
@@ -2325,24 +2936,39 @@ put:
 static int
 veb_del_addr(struct veb_softc *sc, const struct ifbareq *ifba)
 {
-	uint16_t vid = sc->sc_dflt_pvid;
+	uint16_t vp, vs;

-	if (vid == IFBR_PVID_NONE)
+	vs = sc->sc_dflt_pvid;
+	if (vs == IFBR_PVID_NONE)
 		return (ESRCH);

-	return (etherbridge_del_addr(&sc->sc_eb,
-	    vid, &ifba->ifba_dst));
+	smr_read_enter();
+	vp = veb_pvlan(sc, vs);
+	smr_read_leave();
+
+	vp &= VEB_PVLAN_V_MASK;
+
+	return (etherbridge_del_addr(&sc->sc_eb, vp, &ifba->ifba_dst));
 }

 static int
 veb_del_vid_addr(struct veb_softc *sc, const struct ifbvareq *ifbva)
 {
-	if (ifbva->ifbva_vid < EVL_VLID_MIN ||
-	    ifbva->ifbva_vid > EVL_VLID_MAX)
+	uint16_t vp, vs;
+
+	vs = ifbva->ifbva_vid;
+
+	if (vs < EVL_VLID_MIN ||
+	    vs > EVL_VLID_MAX)
 		return (EINVAL);

-	return (etherbridge_del_addr(&sc->sc_eb,
-	    ifbva->ifbva_vid, &ifbva->ifbva_dst));
+	smr_read_enter();
+	vp = veb_pvlan(sc, vs);
+	smr_read_leave();
+
+	vp &= VEB_PVLAN_V_MASK;
+
+	return (etherbridge_del_addr(&sc->sc_eb, vp, &ifbva->ifbva_dst));
 }

 static int
@@ -2591,6 +3217,9 @@ veb_eb_port_sa(void *arg, struct sockadd
 {
 	ss->ss_family = AF_UNSPEC;
 }
+
+RBT_GENERATE(veb_pvlan_vp, veb_pvlan, v_entry, veb_pvlan_vp_cmp);
+RBT_GENERATE(veb_pvlan_vs, veb_pvlan, v_entry, veb_pvlan_vs_cmp);

 /*
  * virtual ethernet bridge port
Index: sys/net/if_vxlan.c
===================================================================
RCS file: /cvs/src/sys/net/if_vxlan.c,v
diff -u -p -r1.105 if_vxlan.c
--- sys/net/if_vxlan.c	1 Nov 2025 10:04:49 -0000	1.105
+++ sys/net/if_vxlan.c	15 Nov 2025 00:27:42 -0000
@@ -694,8 +694,8 @@ vxlan_input(void *arg, struct mbuf *m, s

 	if (sc->sc_mode == VXLAN_TMODE_LEARNING) {
 		eh = mtod(m, struct ether_header *);
-		etherbridge_map_ea(&sc->sc_eb, &addr,
-		    0, (struct ether_addr *)eh->ether_shost);
+		etherbridge_map_ea(&sc->sc_eb, &addr, 0, 0,
+		    (struct ether_addr *)eh->ether_shost);
 	}

 	rxhprio = sc->sc_rxhprio;
@@ -1721,7 +1721,7 @@ vxlan_add_addr(struct vxlan_softc *sc, c
 	}

 	return (etherbridge_add_addr(&sc->sc_eb, &endpoint,
-	    0, &ifba->ifba_dst, type));
+	    0, 0, &ifba->ifba_dst, type));
 }

 static int
Index: sys/sys/sockio.h
===================================================================
RCS file: /cvs/src/sys/sys/sockio.h,v
diff -u -p -r1.85 sockio.h
--- sys/sys/sockio.h	1 Nov 2025 09:46:31 -0000	1.85
+++ sys/sys/sockio.h	15 Nov 2025 00:27:42 -0000
@@ -75,7 +75,11 @@
 #define SIOCGLIFPHYADDR _IOWR('i', 75, struct if_laddrreq) /* get gif addrs */

 #define SIOCBRDGADD _IOW('i', 60, struct ifbreq) /* add bridge ifs */
+#define SIOCBRDGADDPV _IOW('i', 60, struct ifbrpvlan) /* add pvlan */
+#define SIOCBRDGFINDPV _IOWR('i', 60, struct ifbrpvlan) /* find pvlan */
 #define SIOCBRDGDEL _IOW('i', 61, struct ifbreq) /* del bridge ifs */
+#define SIOCBRDGDELPV _IOW('i', 61, struct ifbrpvlan) /* del pvlan */
+#define SIOCBRDGNFINDPV _IOWR('i', 61, struct ifbrpvlan) /* nfind pvlan */
 #define SIOCBRDGGIFFLGS _IOWR('i', 62, struct ifbreq) /* get brdg if flags */
 #define SIOCBRDGSIFFLGS _IOW('i', 63, struct ifbreq) /* set brdg if flags */
 #define SIOCBRDGSCACHE _IOW('i', 64, struct ifbrparam)/* set cache size */
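
for anyone skimming the diff, the forwarding decision made by veb_pvlan_filter() in if_veb.c can be tried out in userland with something like the sketch below. this is not part of the diff. the PVLAN_* constants and their values, struct ctx, and ctx_from_entry() are assumptions made up for illustration (the real VEB_PVLAN_* definitions live elsewhere in the diff and may differ); only the switch in pvlan_filter() mirrors the kernel code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * hypothetical encoding of a pvlan table entry: the low 12 bits hold
 * the primary vid, the high bits hold the type of the vlan. the real
 * VEB_PVLAN_* values may be different.
 */
#define PVLAN_V_MASK		0x0fff
#define PVLAN_T_MASK		0xf000
#define PVLAN_T_PRIMARY		0x0000
#define PVLAN_T_ISOLATED	0x1000
#define PVLAN_T_COMMUNITY	0x2000

/* cut-down stand-in for struct veb_ctx */
struct ctx {
	uint16_t vp;	/* primary vlan the packet belongs to */
	uint16_t vs;	/* vlan the packet was received on */
	uint16_t vt;	/* type of the receiving vlan */
};

/* split a table entry into vp and vt, like veb_port_input() does */
static struct ctx
ctx_from_entry(uint16_t entry, uint16_t vs)
{
	struct ctx c;

	assert(vs <= PVLAN_V_MASK);

	c.vp = entry & PVLAN_V_MASK;
	c.vt = entry & PVLAN_T_MASK;
	c.vs = vs;

	return (c);
}

/* returns 1 if the packet must not be sent to a port in vlan "vs" */
static int
pvlan_filter(const struct ctx *ctx, uint16_t vs)
{
	switch (ctx->vt) {
	case PVLAN_T_PRIMARY:
		/* primary ports are permitted to send to anything */
		break;

	case PVLAN_T_COMMUNITY:
		/* same communities are permitted */
		if (ctx->vs == vs)
			break;

		/* FALLTHROUGH */
	case PVLAN_T_ISOLATED:
		/* isolated (or community) can only send to a primary port */
		if (ctx->vp == vs)
			break;

		return (1);
	}

	return (0);
}
```

with the 900/901 config from the example earlier in this mail, a packet received on isolated vlan 901 gets 0 (forward) for a port in 900 and 1 (drop) for another port in 901, which matches vport901 and vport911 being able to reach vport900 but not each other.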