Index | Thread | Search

From:
Claudio Jeker <cjeker@diehard.n-r-g.com>
Subject:
Re: Don't unlock vnode(9) in fifo_read() and fifo_write()
To:
Vitaliy Makkoveev <mvs@openbsd.org>
Cc:
tech@openbsd.org
Date:
Thu, 18 Apr 2024 12:20:14 +0200

Download raw body.

Thread
  • Vitaliy Makkoveev:

    Don't unlock vnode(9) in fifo_read() and fifo_write()

  • Claudio Jeker:

    Don't unlock vnode(9) in fifo_read() and fifo_write()

  • On Wed, Apr 17, 2024 at 06:33:06PM +0300, Vitaliy Makkoveev wrote:
    > On Tue, Apr 16, 2024 at 04:30:08PM +0200, Claudio Jeker wrote:
    > > On Tue, Apr 16, 2024 at 04:15:05PM +0300, Vitaliy Makkoveev wrote:
    > > > I suspect syzkaller reported panics [1] and [2] happened because sockets
    > > > were closed due vnode(9) lock release.
    > > > 
    > > > The solock() designed to be take after vnode(9) lock, and unix(4)
    > > > sockets perform solock() release for that purpose. No reason to release
    > > > vnode(9) lock in fifo_read() and fifo_write() around soreceive() and
    > > > sosend() respectively.
    > > > 
    > > > I can't reproduce syzkaller panics, so I can't say the issues were
    > > > fixed, but the unnecessary re-locking is the bad thing in any case.
    > > > 
    > > > I performed regress/sys/fifofs/ and regress/sys/kern/unixsockets/ test
    > > > without witness(4) warnings.
    > > > 
    > > > 1. https://syzkaller.appspot.com/bug?extid=5779bc64fc4fdd0a5140
    > > > 2. https://syzkaller.appspot.com/bug?extid=b7ef44521c90d0740ba3
    > > 
    > > I'm not sure if this diff is correct. I doubt it will fix this issue.
    > > 
    > > The problem here at hand is that a vnode gets repurposed while active.
    > > That happens outside of the fifo code since e.g. fifo_read is called via
    > > function pointer.
    > > 
    > > The call in [1] starts as a ufs_symlink() call but in the end tries to
    > > write to a FIFO. That makes no sense. The ufs_symlink should call write on
    > > a vnode that points into ufs not to a FIFO.
    > > 
    > > In [2] the issue is similar, a read of a vnd block ends up on a FIFO.
    > > Again the file that vnd is backed with is most probably on UFS.
    > > 
    > > This VOP_UNLOCK() and then vn_lock() dance is also done by specfs for char
    > > devices. In my opinion this looks more like a missing vref() which allows
    > > the system to recycle a vnode where it should not.
    > > VOP_UNLOCK() and vn_lock() are more about concurrent access to a file /
    > > block.  The idea is that you can have multiple reader or writers since the
    > > writes are atomic (seems a bit strange on FIFOs where a write may not be
    > > atomic) but makes sense for things like /dev/tun or /dev/bpf.
    > > 
    > 
    > What do you mean? Socket can have only one reader and one writer, the
    > rests will sleep in solock() or sblock() acquisition. The same behaviour
    > should be with vnode lock acquisition.
    
    The vnode lock is there to prevent concurrent access to files. Devices
    normally behave differently and don't need such a lock (specfs and fifofs do
    not even implement vop_lock which is extra confusing).
    
    The thing is that if you hold an exclusive lock and then have soreceive
    sleep. It is not possible to write to the same vnode to wakeup the
    soreceive. The write will block on the vnode lock waiting for the read to
    be over and so you deadlock.
    
    I always get lost in VFS and how specfs and fifofs are hooked into the
    other filesystems. So I'm not sure if those VOP_UNLOCK() and vn_lock()
    calls do anything or not.
    
    > > I guess you need to reduce the number of vnodes in your system to trigger
    > > aggressive vnode reuse to tickle this bug.
    > >  
    
    -- 
    :wq Claudio
    
    
    
  • Vitaliy Makkoveev:

    Don't unlock vnode(9) in fifo_read() and fifo_write()

  • Claudio Jeker:

    Don't unlock vnode(9) in fifo_read() and fifo_write()