
From: Robert <rmz@hostexpert.pl>
Subject: vfs: cap maxvnodes autogrow from bcstats.numbufs
To: tech@openbsd.org
Date: Wed, 27 May 2026 17:14:02 +0200


Hello,

I am seeing a serious performance issue on OpenBSD on a hosting server 
with many files and 128 GB RAM.

After running a large backup scan, for example with restic or rsync, the 
kernel cache grows very large. That alone would not be a problem, but 
after such a scan normal file access becomes much slower.

This is especially visible with PHP CMS workloads, where applications 
perform more filesystem I/O and touch many files during a single 
request. Simple websites load about 3-4 times slower, while larger PHP 
CMS-based sites can become tens of times slower after the backup scan.

The server uses fast NVMe storage. In this workload, a very large 
vnode/buffer cache appears to hurt performance more than it helps.

I traced the issue to sys/kern/vfs_subr.c, in getnewvnode():

```
maxvnodes = maxvnodes < bcstats.numbufs ? bcstats.numbufs
     : maxvnodes;
```

Because of this, maxvnodes can grow to match bcstats.numbufs and is 
never reduced afterwards. After a large backup scan this results in a 
very large vnode limit, and the system keeps a huge amount of 
vnode/buffer cache state.
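The ratchet effect can be illustrated with a small user-space model. This is only a sketch: the function names autogrow() and replay() are made up for the illustration, and any buffer counts passed in are hypothetical samples, not measurements from my server.

```c
#include <stddef.h>

/*
 * User-space model of the autogrow line quoted above. The parameter
 * names mirror the kernel variables, but this is not kernel code.
 */
static long
autogrow(long maxvnodes, long numbufs)
{
	return maxvnodes < numbufs ? numbufs : maxvnodes;
}

/*
 * Apply autogrow once per sample of the buffer count, as if
 * getnewvnode() ran at each point, and return the final limit.
 */
static long
replay(long maxvnodes, const long *numbufs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		maxvnodes = autogrow(maxvnodes, numbufs[i]);
	return maxvnodes;
}
```

Replaying a hypothetical sequence such as { 4000, 900000, 20000 } from a starting limit of 5926 leaves the limit at the peak value, 900000, even though the last sample is far lower. That is the "never reduced afterwards" behavior described above.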

As a local test, I disabled this automatic maxvnodes growth. With this 
change the kernel stays much closer to the configured kern.maxvnodes 
value. In my case kern.maxvnodes is 5926 and kern.numvnodes stays 
around 11854, which matches the expected 2x behavior from vntblinit().

After applying this patch, the slowdown disappears on my workload. PHP 
CMS sites return to normal response times even after large restic/rsync 
backup scans.

My local test patch is:

```
Index: sys/kern/vfs_subr.c
===================================================================
RCS file: /cvs/src/sys/kern/vfs_subr.c,v
retrieving revision 1.319
diff -u -p -u -r1.319 vfs_subr.c
--- sys/kern/vfs_subr.c	3 Feb 2024 18:51:58 -0000	1.319
+++ sys/kern/vfs_subr.c	11 Feb 2025 01:52:47 -0000
@@ -379,8 +379,8 @@ getnewvnode(enum vtagtype tag, struct mo
 	 * allow maxvnodes to increase if the buffer cache itself
 	 * is big enough to justify it. (we don't shrink it ever)
 	 */
-	maxvnodes = maxvnodes < bcstats.numbufs ? bcstats.numbufs
-	    : maxvnodes;
+//	maxvnodes = maxvnodes < bcstats.numbufs ? bcstats.numbufs
+//	    : maxvnodes;
 
 	/*
 	 * We must choose whether to allocate a new vnode or recycle an
```

I also checked vfs_subr.c revision 1.333. It moves UVM vnode allocation 
out of getnewvnode(), which reduces memory waste per vnode, but it does 
not address this issue. The maxvnodes autogrow based on bcstats.numbufs 
is still present, so maxvnodes can still grow after a large backup scan 
and never shrink afterwards.

I do not claim that simply removing this code is the best final fix. It 
is only a local workaround that clearly improves this workload. Maybe a 
better solution would be to limit this autogrow, make it shrinkable, or 
expose a tunable to control the maximum automatic vnode growth caused by 
buffer cache size.
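To make the "limit this autogrow" option concrete, here is a minimal user-space sketch. It is not a proposed kernel patch: autogrow_capped() and its cap parameter are hypothetical, no such tunable exists in the code quoted above, and how a cap would be chosen or exposed is left open.

```c
/*
 * Hypothetical variant of the autogrow line: grow maxvnodes toward
 * numbufs, but never past `cap`. A cap at or below the current limit
 * disables growth. The limit is still never reduced here.
 */
static long
autogrow_capped(long maxvnodes, long numbufs, long cap)
{
	long target = numbufs < cap ? numbufs : cap;

	return maxvnodes < target ? target : maxvnodes;
}
```

With a hypothetical cap of 50000, a buffer count of 900000 would raise a limit of 5926 only to 50000, while a cap of 0 would leave the limit at 5926, which is equivalent to my local workaround.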

I can provide more details, measurements, sysctl output, or test 
alternative patches if needed.

System details:

* OpenBSD version:
* Architecture: amd64
* RAM: 128 GB
* Storage: NVMe
* Workload: hosting server, many small files, many PHP CMS installations
* Backup tools tested: restic, rsync
* kern.maxvnodes: 5926
* kern.numvnodes after patch: about 11854

Best regards,
Robert