From: Vitaliy Makkoveev <mvs@openbsd.org>
Subject: Re: Don't unlock vnode(9) in fifo_read() and fifo_write()
To: tech@openbsd.org
Date: Thu, 18 Apr 2024 13:34:38 +0300

On Thu, Apr 18, 2024 at 12:20:14PM +0200, Claudio Jeker wrote:
> On Wed, Apr 17, 2024 at 06:33:06PM +0300, Vitaliy Makkoveev wrote:
> > On Tue, Apr 16, 2024 at 04:30:08PM +0200, Claudio Jeker wrote:
> > > On Tue, Apr 16, 2024 at 04:15:05PM +0300, Vitaliy Makkoveev wrote:
> > > > I suspect syzkaller reported panics [1] and [2] happened because sockets
> > > > were closed due vnode(9) lock release.
> > > > 
> > > > The solock() designed to be take after vnode(9) lock, and unix(4)
> > > > sockets perform solock() release for that purpose. No reason to release
> > > > vnode(9) lock in fifo_read() and fifo_write() around soreceive() and
> > > > sosend() respectively.
> > > > 
> > > > I can't reproduce syzkaller panics, so I can't say the issues were
> > > > fixed, but the unnecessary re-locking is the bad thing in any case.
> > > > 
> > > > I performed regress/sys/fifofs/ and regress/sys/kern/unixsockets/ test
> > > > without witness(4) warnings.
> > > > 
> > > > 1. https://syzkaller.appspot.com/bug?extid=5779bc64fc4fdd0a5140
> > > > 2. https://syzkaller.appspot.com/bug?extid=b7ef44521c90d0740ba3
> > > 
> > > I'm not sure if this diff is correct. I doubt it will fix this issue.
> > > 
> > > The problem here at hand is that a vnode gets repurposed while active.
> > > That happens outside of the fifo code since e.g. fifo_read is called via
> > > function pointer.
> > > 
> > > The call in [1] starts as a ufs_symlink() call but in the end tries to
> > > write to a FIFO. That makes no sense. The ufs_symlink should call write on
> > > a vnode that points into ufs not to a FIFO.
> > > 
> > > In [2] the issue is similar, a read of a vnd block ends up on a FIFO.
> > > Again the file that vnd is backed with is most probably on UFS.
> > > 
> > > This VOP_UNLOCK() and then vn_lock() dance is also done by specfs for char
> > > devices. In my opinion this looks more like a missing vref() which allows
> > > the system to recycle a vnode where it should not.
> > > VOP_UNLOCK() and vn_lock() are more about concurrent access to a file /
> > > block.  The idea is that you can have multiple reader or writers since the
> > > writes are atomic (seems a bit strange on FIFOs where a write may not be
> > > atomic) but makes sense for things like /dev/tun or /dev/bpf.
> > > 
> > 
> > What do you mean? Socket can have only one reader and one writer, the
> > rests will sleep in solock() or sblock() acquisition. The same behaviour
> > should be with vnode lock acquisition.
> 
> The vnode lock is there to prevent concurrent access to files. Devices
> normally behave differently and don't need such a lock (specfs and fifofs do
> not even implement vop_lock which is extra confusing).
> 
> The thing is that if you hold an exclusive lock and then have soreceive
> sleep. It is not possible to write to the same vnode to wakeup the
> soreceive. The write will block on the vnode lock waiting for the read to
> be over and so you deadlock.
> 

Missing that. Thanks for explanation.

> I always get lost in VFS and how specfs and fifofs are hooked into the
> other filesystems. So I'm not sure if those VOP_UNLOCK() and vn_lock()
> calls do anything or not.
> 
> > > I guess you need to reduce the number of vnodes in your system to trigger
> > > aggressive vnode reuse to tickle this bug.
> > >  
> 
> -- 
> :wq Claudio