Index | Thread | Search

From:
Kirill A. Korinsky <kirill@korins.ky>
Subject:
Re: sys/ffs: reclaim vnode before dropping last ref
To:
tech@openbsd.org
Date:
Wed, 24 Jun 2026 12:04:32 +0200

Download raw body.

Thread
On Wed, 24 Jun 2026 11:53:53 +0200,
Jeremie Courreges-Anglas <jca@wxcvbn.org> wrote:
> 
> On Wed, Jun 24, 2026 at 03:35:08AM -0600, Theo de Raadt wrote:
> > This vrele/vgone error is safe if we want to keep everything biglock.....
> 
> For people not in the hackroom, right now a kernel with kirill's diff
> breaks:
> 
> root on sd0a (6cd23ae55f48f660.a) swap on sd0b dump on sd0b
> uvm_fault(0xffffef843fa328a0, 0x18, 0, 1) -> e
> kernel: page fault trap, code=0
> Stopped at      spec_close+0x35:        movl    0x18(%rax),%eax
>     TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
> *362332  18108      0         0x3          0    0K swapctl
> spec_close(ffff8000e608aa98) at spec_close+0x35
> VOP_CLOSE(ffffef83cbf32e60,4,ffffffffffffffff,ffff8000e604d238) at VOP_CLOSE+0x
> 59
> vclean(ffffef83cbf32e60,8,ffff8000e604d238) at vclean+0x1c1
> vgonel(ffffef83cbf32e60,ffff8000e604d238) at vgonel+0x63
> ffs_vinit(ffff8000002f2800,ffff8000e608ac00) at ffs_vinit+0xc6
> ffs_vget(ffff8000002f2800,1309a,ffff8000e608acc8) at ffs_vget+0x31f
> ufs_lookup() at ufs_lookup+0xafd
> VOP_LOOKUP(ffffef811f0a1510,ffff8000e608b2d0,ffff8000e608b300) at VOP_LOOKUP+0x
> 3f
> vfs_lookup(ffff8000e608b2a0) at vfs_lookup+0x39b
> namei(ffff8000e608b2a0) at namei+0x475
> sys_swapctl(ffff8000e604d238,ffff8000e608b470,ffff8000e608b3f0) at sys_swapctl+
> 0x1db
> syscall(ffff8000e608b470) at syscall+0x5d5
> Xsyscall() at Xsyscall+0x128
> end of kernel
> end trace frame: 0x75f3ac954920, count: 2
> https://www.openbsd.org/ddb.html describes the minimum info required in bug
> reports.  Insufficient info makes it difficult to find and fix bugs.
> ddb{0}> bo re
> rebooting...
> 

Yeah, it has the second part which I'm still thinking in a way that I broke.

It boots, but...

So.

Several checkalias() callers discard the vnode passed to checkalias() after
moving its filesystem node to the returned alias. In the uncommon block
device alias case, checkalias() leaves that discarded vnode as a shell: it
uses spec_vops, but has no specinfo.

If vgonel() reclaims such a vnode while the caller still holds the last
reference, vclean(DOCLOSE) treats it as active and dispatches VOP_CLOSE() to
spec_close(), which expects v_specinfo to be present and dereferences
v_rdev.

Such shell vnodes do not represent an opened device instance, so there is
nothing to close through spec_vops. Skip DOCLOSE for spec_vops vnodes
without specinfo.

And here the diff:

Index: sys/kern/vfs_subr.c
===================================================================
RCS file: /home/cvs/src/sys/kern/vfs_subr.c,v
diff -u -p -r1.334 vfs_subr.c
--- sys/kern/vfs_subr.c	22 Jun 2026 17:35:50 -0000	1.334
+++ sys/kern/vfs_subr.c	24 Jun 2026 09:36:03 -0000
@@ -1168,7 +1168,7 @@ vgonel(struct vnode *vp, struct proc *p)
 {
 	struct vnode *vq;
 	struct vnode *vx;
-	int s;
+	int s, flags = DOCLOSE;
 
 	KASSERT(vp->v_uvcount == 0);
 
@@ -1188,7 +1188,9 @@ vgonel(struct vnode *vp, struct proc *p)
 	/*
 	 * Clean out the filesystem specific data.
 	 */
-	vclean(vp, DOCLOSE, p);
+	if (vp->v_op == &spec_vops && vp->v_specinfo == NULL)
+		flags = 0;
+	vclean(vp, flags, p);
 	/*
 	 * Delete from old mount point vnode list, if on one.
 	 */


-- 
wbr, Kirill