From: David Gwynne Subject: wake up mbuf pools when pages get released To: tech@openbsd.org Date: Thu, 29 Jan 2026 10:53:17 +1000 mbufs are special for lots of reasons, but one is that the total amount of memory that mbufs can be allocated out of is limited by mbuf_mem_limit. all the mbuf and cluster pools are subject to that limit, which is enforced by having these pools use a custom pool page allocator that checks that limit and accounts for their use of it. the problem is the pools don't coordinate with each other. when mbuf_mem_limit is hit, it's possible for a sleeping allocation to wait on memory in one pool, but when memory is released by another pool that first one doesn't know about it, and doesn't get woken up to try and allocate pages that are now free in the backend page allocator. the simple fix for this is to wakeup the mbuf pools when pages are returned to the backend mbuf page allocator. if any of the pools have pending allocation requests, they are moved forward by the wakeup. this means if a system does hit the mbuf mem limit and a lot of procs/threads get stuck sleeping on mbuf allocations, there's a better chance they can now be pushed forward if another mbuf pool backs off and gives memory back to the system. the wakeups are deferred to a task running in the systqmp taskq. this is the same taskq that the pool gc ops run in. if multiple mbuf pools have gced pages released, this debounces the wakeup calls so they only happen once per pool gc run. i could avoid the wakeup calls by only scheduling the task when the current mbuf_mem_alloc value is close to mbuf_mem_limit, but the pool gc process is the extremely slow path anyway. the ratio of pool_put operations to m_pool_free ops is many millions to one. im going to commit this in the next day or two unless there are objections. oks are welcome too. Index: uipc_mbuf.c =================================================================== RCS file: /cvs/src/sys/kern/uipc_mbuf.c,v diff -u -p -r1.302 uipc_mbuf.c --- uipc_mbuf.c 6 Aug 2025 14:00:33 -0000 1.302 +++ uipc_mbuf.c 29 Jan 2026 00:32:28 -0000 @@ -81,6 +81,7 @@ #include #include #include +#include #include #include @@ -131,6 +132,9 @@ void m_zero(struct mbuf *); unsigned long mbuf_mem_limit; /* [a] how much memory can be allocated */ unsigned long mbuf_mem_alloc; /* [a] how much memory has been allocated */ +void m_pool_wakeup(void *); +struct task mbuf_mem_wakeup = TASK_INITIALIZER(m_pool_wakeup, NULL); + void *m_pool_alloc(struct pool *, int, int *); void m_pool_free(struct pool *, void *); @@ -212,17 +216,13 @@ mbcpuinit(void) int nmbclust_update(long newval) { - int i; - if (newval <= 0 || newval > LONG_MAX / MCLBYTES) return ERANGE; /* update the global mbuf memory limit */ atomic_store_long(&nmbclust, newval); atomic_store_long(&mbuf_mem_limit, newval * MCLBYTES); - pool_wakeup(&mbpool); - for (i = 0; i < nitems(mclsizes); i++) - pool_wakeup(&mclpools[i]); + task_add(systqmp, &mbuf_mem_wakeup); return 0; } @@ -1471,6 +1471,18 @@ m_pool_free(struct pool *pp, void *v) (*pool_allocator_multi.pa_free)(pp, v); atomic_sub_long(&mbuf_mem_alloc, pp->pr_pgsize); + + task_add(systqmp, &mbuf_mem_wakeup); +} + +void +m_pool_wakeup(void *null) +{ + int i; + + pool_wakeup(&mbpool); + for (i = 0; i < nitems(mclsizes); i++) + pool_wakeup(&mclpools[i]); } void