Mailing List Archive

On Thu, Jan 29, 2026 at 10:53:17AM +1000, David Gwynne wrote:
> mbufs are special for lots of reasons, but one is that the total
> amount of memory that mbufs can be allocated out of is limited by
> mbuf_mem_limit. all the mbuf and cluster pools are subject to that
> limit, which is enforced by having these pools use a custom pool
> page allocator that checks that limit and accounts for their use
> of it.
>
> the problem is the pools don't coordinate with each other. when
> mbuf_mem_limit is hit, it's possible for a sleeping allocation to
> wait on memory in one pool, but when memory is released by another
> pool that first one doesn't know about it, and doesn't get woken
> up to try and allocate pages that are now free in the backend page
> allocator.
>
> the simple fix for this is to wakeup the mbuf pools when pages are
> returned to the backend mbuf page allocator. if any of the pools have
> pending allocation requests, they are moved forward by the wakeup.
>
> this means if a system does hit the mbuf mem limit and a lot of
> procs/threads get stuck sleeping on mbuf allocations, there's a
> better chance they can now be pushed forward if another mbuf pool
> backs off and gives memory back to the system.
>
> the wakeups are deferred to a task running in the systqmp taskq.
> this is the same taskq that the pool gc ops run in. if multiple
> mbuf pools have gced pages released, this debounces the wakeup calls
> so they only happen once per pool gc run.
>
> i could avoid the wakeup calls by only scheduling the task when the
> current mbuf_mem_alloc value is close to mbuf_mem_limit, but the pool gc
> process is the extremely slow path anyway. the ratio of pool_put
> operations to m_pool_free ops is many millions to one.
>
> im going to commit this in the next day or two unless there are
> objections. oks are welcome too.

OK claudio@

Is the delay introduced by debouncing via the task a problem?
Once the memory is freed, some other pool is able to grab it before the
wakeup makes it through. Also, isn't m_pool_free() already running on the
pool gc (and so systqmp) taskq?

> Index: uipc_mbuf.c
> ===================================================================
> RCS file: /cvs/src/sys/kern/uipc_mbuf.c,v
> diff -u -p -r1.302 uipc_mbuf.c
> --- uipc_mbuf.c	6 Aug 2025 14:00:33 -0000	1.302
> +++ uipc_mbuf.c	29 Jan 2026 00:32:28 -0000
> @@ -81,6 +81,7 @@
>  #include <sys/pool.h>
>  #include <sys/percpu.h>
>  #include <sys/sysctl.h>
> +#include <sys/task.h>
>
>  #include <sys/socket.h>
>  #include <net/if.h>
> @@ -131,6 +132,9 @@ void m_zero(struct mbuf *);
>  unsigned long mbuf_mem_limit;	/* [a] how much memory can be allocated */
>  unsigned long mbuf_mem_alloc;	/* [a] how much memory has been allocated */
>
> +void	m_pool_wakeup(void *);
> +struct task mbuf_mem_wakeup = TASK_INITIALIZER(m_pool_wakeup, NULL);
> +
>  void	*m_pool_alloc(struct pool *, int, int *);
>  void	m_pool_free(struct pool *, void *);
>
> @@ -212,17 +216,13 @@ mbcpuinit(void)
>  int
>  nmbclust_update(long newval)
>  {
> -	int i;
> -
>  	if (newval <= 0 || newval > LONG_MAX / MCLBYTES)
>  		return ERANGE;
>  	/* update the global mbuf memory limit */
>  	atomic_store_long(&nmbclust, newval);
>  	atomic_store_long(&mbuf_mem_limit, newval * MCLBYTES);
>
> -	pool_wakeup(&mbpool);
> -	for (i = 0; i < nitems(mclsizes); i++)
> -		pool_wakeup(&mclpools[i]);
> +	task_add(systqmp, &mbuf_mem_wakeup);
>
>  	return 0;
>  }
> @@ -1471,6 +1471,18 @@ m_pool_free(struct pool *pp, void *v)
>  	(*pool_allocator_multi.pa_free)(pp, v);
>
>  	atomic_sub_long(&mbuf_mem_alloc, pp->pr_pgsize);
> +
> +	task_add(systqmp, &mbuf_mem_wakeup);
> +}
> +
> +void
> +m_pool_wakeup(void *null)
> +{
> +	int i;
> +
> +	pool_wakeup(&mbpool);
> +	for (i = 0; i < nitems(mclsizes); i++)
> +		pool_wakeup(&mclpools[i]);
>  }
>
>  void

-- 
:wq Claudio
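
For readers following the debounce discussion, here is a minimal user-space sketch of the coalescing behaviour the patch relies on. It assumes, as task_add(9) documents, that adding a task which is already pending does nothing and returns 0. Every `sketch_*` name is a hypothetical stand-in rather than the kernel API, and the model is single-threaded with no locking, so it says nothing about the race raised in the reply above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct task; not the kernel's task(9) type. */
struct sketch_task {
	void	(*fn)(void *);
	void	*arg;
	int	 queued;	/* nonzero while the task is pending */
};

static int wakeup_runs;		/* how many times the handler has run */

/* Stand-in for m_pool_wakeup(); the real one calls pool_wakeup() per pool. */
static void
sketch_pool_wakeup(void *arg)
{
	(void)arg;
	wakeup_runs++;
}

/* Queue the task unless it is already pending. Returns 1 if newly queued. */
static int
sketch_task_add(struct sketch_task *t)
{
	if (t->queued)
		return 0;
	t->queued = 1;
	return 1;
}

/* Run the task if pending; clear the flag first so it can be re-queued. */
static void
sketch_taskq_run(struct sketch_task *t)
{
	assert(t->fn != NULL);
	if (!t->queued)
		return;
	t->queued = 0;
	t->fn(t->arg);
}

/*
 * Model one gc pass that frees npages pages before the task gets to run.
 * Returns how many times the wakeup handler ran for that pass.
 */
static int
sketch_gc_pass(struct sketch_task *t, int npages)
{
	int i, before = wakeup_runs;

	for (i = 0; i < npages; i++)
		sketch_task_add(t);	/* one call per freed page */
	sketch_taskq_run(t);
	return wakeup_runs - before;
}
```

With a task initialised as `{ sketch_pool_wakeup, NULL, 0 }`, `sketch_gc_pass(&t, 5)` runs the handler once however many pages were freed, and a pass that frees nothing never runs it. The model only covers the case where the task runs after the frees; it does not show what happens if another pool allocates the freed pages in between.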

2026-01-29 00:53 David Gwynne:
wake up mbuf pools when pages get released
- 2026-01-29 09:35 Claudio Jeker:
  wake up mbuf pools when pages get released
- - 2026-01-29 11:10 David Gwynne:
    wake up mbuf pools when pages get released
- - - 2026-01-29 12:45 Claudio Jeker:
      wake up mbuf pools when pages get released