From: Martin Pieuchot <mpi@openbsd.org>
Subject: Re: Don't unlock vnode(9) in fifo_read() and fifo_write()
To: Vitaliy Makkoveev <mvs@openbsd.org>, tech@openbsd.org
Date: Thu, 18 Apr 2024 13:07:12 +0200

On 18/04/24(Thu) 11:52, Claudio Jeker wrote:
> On Thu, Apr 18, 2024 at 11:42:09AM +0200, Martin Pieuchot wrote:
> > On 17/04/24(Wed) 18:33, Vitaliy Makkoveev wrote:
> > > On Tue, Apr 16, 2024 at 04:30:08PM +0200, Claudio Jeker wrote:
> > > > On Tue, Apr 16, 2024 at 04:15:05PM +0300, Vitaliy Makkoveev wrote:
> > > > > I suspect syzkaller reported panics [1] and [2] happened because sockets
> > > > > were closed due vnode(9) lock release.
> > 
> > How did you come to this conclusion?
> > 
> > The bug is a NULL dereference:
> > 
> > uvm_fault(0xfffffd806d2c9780, 0x8, 0, 1) -> e
> > kernel: page fault trap, code=0
> > Stopped at      fifo_write+0x4e:        movq    0x8(%rax),%r12
> > 
> > I don't have the same assembly here, but the offset 0x8 seems to correspond
> > the `a_uio' field.   So the argument passed to fifo_write() is NULL.
> > 
> > How can it be NULL?
> 
> As I mentioned I think the problem is that the vnode got recycled and UFS is
> using a now bad vnode for a totally different operation. This is why nothing
> makes sense.

How can it be recycled in the middle of ufs_synlink()?  The vnode should
have a valid reference since vput() is called after vn_rdwr().  So
unlocking the vnode shouldn't be a problem there.  I don't understand
what's happening here.  Somebody wants to look at the reproducer?