Download raw body.
pf_route simplification, or pf_route vs pfsync
Hello,
On Sat, Jul 06, 2024 at 04:06:16PM +1000, David Gwynne wrote:
>
> my second attempt was to gut pf_route and have it jus add an mbuf
> tag with the dst address i want to use instead of the real ip. the
> actual ip stack is then modified to use the tag for the real route
> lookups.
>
> in my opinion this turned out really well. it's long bothered me that
> pf_route and pf_route6 are a large duplication of the ip_output code.
> there's been cases where we forget to update pf_route with new stack
> features. this problem goes away if we just use the stack for
> everything. a big chunk of code just goes away.
>
> pfsync defer with route-to works too.
>
> ive been running this in production since the end of april without
> issue. if there's a performance hit i haven't noticed it. mbuf tag
> lookups are cheap since henning added what's basically a bloom
> filter for them to mbufs.
>
> ok?
I think this makes sense. I don't object the idea. I will also appreciate
if more people familiar with network stack say their yes/no here.
I have some questions to the code itself see further below.
thanks and
regards
sashan
>
> Index: net/if_pfsync.c
> ===================================================================
> RCS file: /cvs/src/sys/net/if_pfsync.c,v
> diff -u -p -r1.326 if_pfsync.c
> --- net/if_pfsync.c 24 May 2024 06:38:41 -0000 1.326
> +++ net/if_pfsync.c 6 Jul 2024 05:16:51 -0000
> @@ -2068,13 +2068,16 @@ pfsync_deferrals_task(void *arg)
> static void
> pfsync_defer_output(struct pfsync_deferral *pd)
> {
> - struct pf_pdesc pdesc;
> struct pf_state *st = pd->pd_st;
> + struct mbuf *m = pd->pd_m;
> +
> + if (st->rt) {
> + struct pf_pdesc pdesc;
>
> - if (st->rt == PF_ROUTETO) {
> if (pf_setup_pdesc(&pdesc, st->key[PF_SK_WIRE]->af,
> - st->direction, NULL, pd->pd_m, NULL) != PF_PASS)
> - return;
> + st->direction, NULL, m, NULL) != PF_PASS)
> + goto done;
> +
the current code on pf_setup_pdesc() failure just returns.
Your change jumps to 'done' where reference to state is dropped.
this feels like a fix of reference leak on error in pf_setup_pdesc().
just checking if this is intentional change (sorry if I sound like
Cpt. Obvious here).
</snip>
> Index: net/pf.c
> ===================================================================
> RCS file: /cvs/src/sys/net/pf.c,v
> diff -u -p -r1.1201 pf.c
> --- net/pf.c 4 Jul 2024 12:50:08 -0000 1.1201
> +++ net/pf.c 6 Jul 2024 05:16:51 -0000
> @@ -6568,12 +6568,7 @@ void
> pf_route(struct pf_pdesc *pd, struct pf_state *st)
> {
> struct mbuf *m0;
> - struct mbuf_list ml;
> - struct sockaddr_in *dst, sin;
> - struct rtentry *rt = NULL;
> - struct ip *ip;
> - struct ifnet *ifp = NULL;
> - unsigned int rtableid;
> + struct m_tag *mtag;
>
> if (pd->m->m_pkthdr.pf.routed++ > 3) {
> m_freem(pd->m);
> @@ -6582,117 +6577,47 @@ pf_route(struct pf_pdesc *pd, struct pf_
> }
>
> if (st->rt == PF_DUPTO) {
> - if ((m0 = m_dup_pkt(pd->m, max_linkhdr, M_NOWAIT)) == NULL)
> - return;
> - } else {
> - if ((st->rt == PF_REPLYTO) == (st->direction == pd->dir))
> + m0 = m_dup_pkt(pd->m, max_linkhdr, M_NOWAIT);
> + if (m0 == NULL)
> return;
> + } else if ((st->rt == PF_REPLYTO) == (st->direction == pd->dir))
> + return;
> + else
> m0 = pd->m;
</snip>
> - ip = mtod(m0, struct ip *);
> + mtag = m_tag_find(m0, PACKET_TAG_PF_ROUTE, NULL);
> + if (mtag == NULL) {
> + mtag = m_tag_get(PACKET_TAG_PF_ROUTE, sizeof(st->rt_addr),
> + M_NOWAIT);
> + if (mtag == NULL)
> + goto drop;
</snip>
> + *(struct pf_addr *)(mtag + 1) = st->rt_addr;
> + m_tag_prepend(m0, mtag);
I'm sure m_tag_prepend() is a good thing to do in case we had
to allocate mtag earlier. But is it a good to prepend mtag which
is in tagset already? I'm just afraid of list corruption.
I would be more comfortable with moving m_tag_prepend() to
`if (mtag == NULL)` branch.
NIT: given drop: goto target is used only when we fail
to allocate mtag we can perhaps move code from drop
target here and do return instead of goto.
</snip>
> #ifdef INET6
> -/* pf_route6() may change pd->m, adjust local copies after calling */
> void
> pf_route6(struct pf_pdesc *pd, struct pf_state *st)
> {
> struct mbuf *m0;
> - struct sockaddr_in6 *dst, sin6;
> - struct rtentry *rt = NULL;
> - struct ip6_hdr *ip6;
> - struct ifnet *ifp = NULL;
> struct m_tag *mtag;
> - unsigned int rtableid;
> +
> +printf("%s\n", __func__);
I think printf() here should be deleted.
>
> if (pd->m->m_pkthdr.pf.routed++ > 3) {
> m_freem(pd->m);
> @@ -6701,101 +6626,37 @@ pf_route6(struct pf_pdesc *pd, struct pf
> }
>
</snip>
> if (st->rt == PF_DUPTO) {
> + mtag = m_tag_find(m0, PACKET_TAG_PF_ROUTE, NULL);
> + if (mtag == NULL) {
> + mtag = m_tag_get(PACKET_TAG_PF_ROUTE, sizeof(st->rt_addr),
> + M_NOWAIT);
> + if (mtag == NULL)
> + goto drop;
> }
</snip>
> + *(struct pf_addr *)(mtag + 1) = st->rt_addr;
> + m_tag_prepend(m0, mtag);
>
same as in pf_route(). I'm concerned with potential
tag-list corruption. Moving call to m_tag_prepend()
to 'if (mtag == NULL)' branch will make more comfortable.
NIT: goto drop: can be replaced in the same way as in
pf_route() earlier.
</snip>
> +drop:
> + if (m0 == pd->m)
> + pd->m = NULL;
> m_freem(m0);
this is in current, but it looks odd now. If we fail to get tag we should
always drop the packet. not just duplicate. the thing is that packets
without tag can not be handled by IP stack as we expect they would
be routed by destination address not the tag (assuming I understand
changes here right). so I would suggest doing something like:
if (m0 != pd->m)
m_freem(m);
m_freem(m0);
pd->m = NULL;
</snip>
rest looks good.
pf_route simplification, or pf_route vs pfsync