Index | Thread | Search

From:
Vitaliy Makkoveev <mvs@openbsd.org>
Subject:
Re: Don't take solock() in soreceive() for tcp(4) sockets
To:
Martin Pieuchot <mpi@openbsd.org>
Cc:
Alexander Bluhm <bluhm@openbsd.org>, tech@openbsd.org
Date:
Sat, 20 Apr 2024 13:33:06 +0300

Download raw body.

Thread

> On 20 Apr 2024, at 12:09, Martin Pieuchot <mpi@openbsd.org> wrote:
> 
> On 19/04/24(Fri) 23:51, Alexander Bluhm wrote:
>> On Wed, Apr 17, 2024 at 06:39:14PM +0300, Vitaliy Makkoveev wrote:
>>> This is the updated diff. Witness kernel passed regress/usr.sbin/syslogd
>>> without issues.
>> 
>> I see a witness error with this diff.  Maybe it was there before
>> and I missed it due to the regular findings.
>> 
>> witness: lock order reversal:
>> 1st 0xf9853a0c vmmaplk (&map->lock)
>> 2nd 0xf93fcd74 sbufrcv (&so->so_rcv.sb_lock)
>> lock order "&so->so_rcv.sb_lock"(rwlock) -> "&map->lock"(rwlock) first seen at:
>> #0  witness_checkorder+0x291
>> #1  rw_enter_read+0x32
>> #2  vm_map_lock_read_ln+0x15
>> #3  uvmfault_lookup+0x90
>> #4  uvm_fault_check+0x14
>> #5  uvm_fault+0xe4
>> #6  kpageflttrap+0xe5
>> #7  trap+0x25f
>> #8  calltrap+0xc
>> #9  copyout+0x42
>> #10 uiomove+0x1c0
>> #11 soreceive+0x841
>> #12 soo_read+0x37
>> #13 dofilereadv+0xb5
>> #14 sys_read+0x49
>> #15 syscall+0x41d
>> #16 Xsyscall_untramp+0xa9
>> lock order data w1 -> w2 missing
> 
> I don't know why WITNESS triggers here when it says "lock order data
> missing".
> 
> This is simply a pagefault occurring while coying data to userland.  I
> cannot imagine a socket buffer lock being taken while holding the vmmap
> lock...  but it wouldn't surprise me if that existed (hello NFS). 
> 
> However I'd suggest you look at uvn_io() which has a special case for
> the NET_LOCK().  If it's possible to do uiomove() w/o holding a lock I'd
> go for it.
> 
> I'd also suggest to do testing with NFS which uses sockets and triggers
> different lock ordering.

This is not NET_LOCK() related. soreceive() and sosend() hold sblock()
for the whole run even while doing uiomove(). No difference is sblock()
rwlock wrapper or something like original implementation, it is still
lock and lock ordering issues are applicable to it.