Index | Thread | Search

From:
Claudio Jeker <cjeker@diehard.n-r-g.com>
Subject:
Re: wireguard - prepend mbuf allocations
To:
David Hill <dhill@mindcry.org>
Cc:
tech@openbsd.org
Date:
Fri, 27 Sep 2024 16:36:56 +0200

Download raw body.

Thread
On Fri, Sep 27, 2024 at 09:48:53AM -0400, David Hill wrote:
> Hi -
> 
> Using the new netstat -m that shows:
> 4510 prepend mbuf allocation
> 0 pullup mbuf allocation
> 1 pullup memory copy
> 0 pulldown mbuf allocation
> 0 pulldown memory copy
> 
> 
> I noticed about 5 million prepend mbuf allocations per day.  bluhm@ gave me
> some code to find the problem users.  I noticed wireguard had a bunch:
> 
> m_prepend: len 28
> -+- mbuf 0xfffffd800a0b4000, dat, off 0, len 96, pktlen 96, clsize 2048
>   \\- total chain 1, len 96, size 2048
> Starting stack trace...
> m_print_debug(fffffd800a0b4000,ffffffff819e8c81,1c) at m_print_debug+0x31d
> m_prepend(fffffd800a0b4000,1c,2) at m_prepend+0xbe
> udp_output(fffffd8211dd7a88,fffffd800a0b4000,ffff80003156da90,fffffd800a0b9600)
> at udp_output+0x2f2
> sosend(ffff800000472018,ffff80003156da90,0,fffffd800a0b4000,fffffd800a0b9600,0)
> at sosend+0x3e2
> wg_send(ffff8000003fe000,ffff80003156dc28,fffffd800a0b4000) at wg_send+0x131
> wg_deliver_out(ffff800016e46ba8) at wg_deliver_out+0x19b
> taskq_thread(ffff80000003d200) at taskq_thread+0x16c
> end trace frame: 0x0, count: 250
> End of stack trace.
> 
> 
> Starting stack trace...
> m_prepend(fffffd801ce2fb00,1c,2) at m_prepend+0x85
> udp_output(fffffd82126f83f0,fffffd801ce2fb00,ffff8000316591e0,0) at
> udp_output+0x2f2
> sosend(ffff80000047d238,ffff8000316591e0,0,fffffd801ce2fb00,0,0) at
> sosend+0x3e2
> wg_send(ffff800000408000,ffff8000316593a8,fffffd801ce2fb00) at wg_send+0x131
> wg_send_buf(ffff800000408000,ffff8000316593a8,ffff800031659438,94) at
> wg_send_buf+0x81
> wg_peer_send_buf(ffff8000008a96b8,ffff800031659438,94) at
> wg_peer_send_buf+0x14a
> wg_send_initiation(ffff8000008a96b8) at wg_send_initiation+0x172
> taskq_thread(ffff8000003c4d00) at taskq_thread+0x16c
> end trace frame: 0x0, count: 249
> End of stack trace.
> 
> This diff prevents the prepend mbuf allocations by reserving max_hdr space.
> Does this look correct?
> 
> Index: net/if_wg.c
> ===================================================================
> RCS file: /cvs/src/sys/net/if_wg.c,v
> retrieving revision 1.38
> diff -u -p -r1.38 if_wg.c
> --- net/if_wg.c	9 Apr 2024 12:53:08 -0000	1.38
> +++ net/if_wg.c	27 Sep 2024 13:36:37 -0000
> @@ -864,6 +864,7 @@ wg_send_buf(struct wg_softc *sc, struct
>  retry:
>  	m = m_gethdr(M_WAIT, MT_DATA);
>  	m->m_len = 0;
> +	m_align(m, len <= MHLEN ? len : max_hdr);
>  	m_copyback(m, 0, len, buf, M_WAIT);

The use of m_copyback() here is very questionable. This should allocate
based on len and then use m_align(). Right now you create a mbuf chain
with a mbuf header followed by whatever m_copyback() comes up with.

In other words this should behave more like what seems to be going on in
wg_encap based on the diff below.

>  	/* As we're sending a handshake packet here, we want high priority */
> @@ -1510,7 +1511,8 @@ wg_encap(struct wg_softc *sc, struct mbu
> 
>  	plaintext_len = min(WG_PKT_WITH_PADDING(m->m_pkthdr.len), t->t_mtu);
>  	padding_len = plaintext_len - m->m_pkthdr.len;
> -	out_len = sizeof(struct wg_pkt_data) + plaintext_len + NOISE_AUTHTAG_LEN;
> +	out_len = max_hdr + sizeof(struct wg_pkt_data) + plaintext_len +
> +	    NOISE_AUTHTAG_LEN;
> 
>  	/*
>  	 * For the time being we allocate a new packet with sufficient size to
> @@ -1525,6 +1527,9 @@ wg_encap(struct wg_softc *sc, struct mbu
>  	if ((mc = m_clget(NULL, M_NOWAIT, out_len)) == NULL)
>  		goto error;
> 

I'm pretty sure you want to use m_align here so that you end up with as
much free headroom in mc as possible. Doing that will remove the need to
fiddle with out_len above.

> +	mc->m_pkthdr.len = mc->m_len = out_len;
> +	m_adj(mc, max_hdr);
> +	
>  	data = mtod(mc, struct wg_pkt_data *);
>  	m_copydata(m, 0, m->m_pkthdr.len, data->buf);
>  	bzero(data->buf + m->m_pkthdr.len, padding_len);
> @@ -1560,9 +1565,6 @@ wg_encap(struct wg_softc *sc, struct mbu
> 
>  	mc->m_pkthdr.ph_loopcnt = m->m_pkthdr.ph_loopcnt;
>  	mc->m_flags &= ~(M_MCAST | M_BCAST);
> -	mc->m_len = out_len;
> -	m_calchdrlen(mc);
> -
>  	/*
>  	 * We would count ifc_opackets, ifc_obytes of m here, except if_snd
>  	 * already does that for us, so no need to worry about it.
> 

-- 
:wq Claudio