mbuf cluster m_extref_mtx contention
On Mon, Mar 16, 2026 at 04:55:49PM +0100, Alexander Bluhm wrote:
> On Sat, Mar 14, 2026 at 10:25:33AM +1000, David Gwynne wrote:
> > On Wed, Feb 25, 2026 at 11:02:59PM +0100, Alexander Bluhm wrote:
> > > Hi David,
> > >
> > > I have tested your diff again and throughput in my splicing test
> > > goes from 35 GBit/sec to 65 GBit/sec.  Could you please commit it,
> > > so I can work on the next bottleneck.
> > >
> > > OK bluhm@
> >
> > im much more comfortable with the proxy version instead of the
> > refcount on a leader version.
> >
> > the leader version adds a constraint where the leader can't detach
> > a cluster until all references are dropped. the consequence of this
> > is immediately demonstrated in m_defrag, and it required auditing
> > the rest of the kernel to make sure that nowhere else removed or
> > replaced a cluster on an mbuf. it feels like a big footgun that's
> > going to hurt someone in the future.
> >
> > delaying the actual free of the leader mbuf is ugly too.
> >
> > i'd prefer to give up a little bit of performance in exchange for
> > less dangerous and ugly code.
>
> I see it differently.  Additional memory allocations make it harder
> to understand what is going on in the whole system.  And they may
> be slower.

The extra allocation should not make it harder IMO.  It actually
simplifies a few things since mbufs are not suddenly retained after
their lifetime.

> The special hacks in m_defrag() that swap the cluster at an existing
> mbuf look ugly to me.  I was happy when I saw your diff that just
> chained the new mbuf.

m_defrag was built with the promise to not modify the pointer to the
mbuf and return a flattened buffer.  This is no longer true with dlg's
initial diff.  The first mbuf is now just an empty shell and we need to
hope that all consumers handle this correctly.  E.g. the usage of mtod()
is very dangerous on such a buffer.

While ugly, I prefer that m_defrag() keeps on doing the magic replace.

> Part of the mbuf design is to use the struct mbuf_ext for cluster
> management.  Doing the refcounting there would be the natural thing
> in the mbuf world.  And only free of the small mbuf, not the cluster
> is delayed.

The problem is that clusters are loosely coupled and we need to keep
references somewhere.  Abusing an mbuf turns that mbuf into something
special, which is not great.  Also, when you m_copym, will it put the
reference into the right mbuf?

> We need a third opinion.  Claudio, what do you think?

In short, I think I'm more on dlg's side and think the proxy version is
easier to argue about.

-- 
:wq Claudio
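
For readers following the thread, the "proxy" idea can be illustrated with a small userland toy model. This is only a sketch under loud assumptions: it is not dlg's diff, it does not use the real struct mbuf or struct mbuf_ext layout, and every name in it (toy_proxy, toy_mbuf, toy_attach, toy_share, toy_detach, toy_clusters_freed) is hypothetical. It is single-threaded, so a plain counter stands in for whatever atomic reference counting the kernel would need. The point it shows is that when the reference count lives in a separately allocated proxy, any mbuf, including the first one, can drop its cluster at any time without waiting for the other references.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical toy types; NOT the OpenBSD mbuf structures. */
struct toy_proxy {
	unsigned int	 refs;		/* toy_mbufs pointing at the cluster */
	char		*cluster;	/* the shared storage */
};

struct toy_mbuf {
	struct toy_proxy *ext;		/* NULL when no cluster is attached */
};

/* Counts freed clusters so the behaviour can be observed. */
int toy_clusters_freed;

/* Allocate a cluster plus its proxy and attach it to m. */
int
toy_attach(struct toy_mbuf *m, size_t len)
{
	struct toy_proxy *p;

	p = malloc(sizeof(*p));
	if (p == NULL)
		return (-1);
	p->cluster = malloc(len);
	if (p->cluster == NULL) {
		free(p);
		return (-1);
	}
	p->refs = 1;
	m->ext = p;
	return (0);
}

/* Make dst share the cluster of src; only the proxy is touched. */
void
toy_share(struct toy_mbuf *dst, struct toy_mbuf *src)
{
	assert(src->ext != NULL);
	src->ext->refs++;
	dst->ext = src->ext;
}

/*
 * Drop m's reference.  Any holder may do this at any time; the
 * cluster and proxy are freed when the last reference goes away.
 */
void
toy_detach(struct toy_mbuf *m)
{
	struct toy_proxy *p = m->ext;

	if (p == NULL)
		return;
	m->ext = NULL;
	if (--p->refs == 0) {
		free(p->cluster);
		free(p);
		toy_clusters_freed++;
	}
}
```

In this model the mbuf that first attached the cluster can call toy_detach() while another mbuf still shares it, which is the operation the leader variant has to forbid or special-case. The cost is the extra allocation for the proxy, which is the performance trade-off discussed above.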