Index | Thread | Search

From:
Alexander Bluhm <bluhm@openbsd.org>
Subject:
Re: fstat sysctl crashes kernel
To:
Vitaliy Makkoveev <otto@bsdbox.dev>
Cc:
tech@openbsd.org
Date:
Fri, 9 Aug 2024 10:20:41 +0200

Download raw body.

Thread
On Thu, Aug 08, 2024 at 09:27:36PM +0300, Vitaliy Makkoveev wrote:
> > On 8 Aug 2024, at 20:46, Alexander Bluhm <bluhm@openbsd.org> wrote:
> > 
> > On Thu, Aug 08, 2024 at 09:47:27AM -0600, Theo de Raadt wrote:
> >> Todd C. Miller <millert@openbsd.org> wrote:
> >> 
> >>> On Thu, 08 Aug 2024 13:21:54 +0200, Alexander Bluhm wrote:
> >>> 
> >>>> Independent of that, can we commit the chunk below?  When searching
> >>>> for a specific process, there is not need to traverse the list to
> >>>> the end.  Also I check for arg >= 0 as this is cosistent with the
> >>>> arg < -1 check before.  Then you don't have to think about what is
> >>>> special about process 0.  There is no functional difference as
> >>>> process 0 has PS_SYSTEM set and is skipped anyway.
> >>> 
> >>> Yes, this makes sense. OK millert@
> >> 
> >> No matter what we do, we don't get an atomic view.  Punting the problem
> >> to userland sidesteps the difficult problem in the kernel.
> > 
> > For me is important that we commit something instead of discussing
> > pro and con of various diffs.
> > 
> > The idea with the marker sounds like the correct thing, but adds a
> > lot of complexity.  mvs@ already worked on something simmilar.  The
> > easiest fix without locking concerns is my userland fstat diff.
> > Locking the whole process list while doing copyout over NFS may
> > open another can of worms.
> > 
> 
> Userland pages are wired at this moment, you can???t stuck in swap
> on nfs here, sysctl_vslock() cares about that.
> 
> > Both bluhm@ and mvs@ diff survive my stress test for hours.  Usually
> > the test crashes or hangs within minutes.  Currently I am testing
> > with witness to find fallout.
> > 
> 
> Nice to hear. Personal me, I prefer allprocess rwlock(9). It is much
> simpler and left userland as is. Also, is allows to remove kernel
> lock around some process stuff like prfind(). For example, from the 
> filt_procattach() where claudio@ left it to make ps dereference safe.

mvs@ diff has finished witness testing.  It complains already during
boot.

reordering: ld.so libc libcrypto sshd sshd-session ssh-agent.
starting early daemons: syslogd pflogd ntpd.
starting RPC daemons:.
witness: lock order reversal:
 1st 0xd7af9474 inode (&ip->i_lock)
 2nd 0xd0f32d68 allpslk (&allprocess_lock)
lock order [1] inode (&ip->i_lock) -> [2] allpslk (&allprocess_lock)
lock order data 0xd0ca8820 -> 0xd0cf51b4 is missing
lock order [2] allpslk (&allprocess_lock) -> [1] inode (&ip->i_lock)
#0  witness_checkorder+0x291
#1  rw_enter+0x4d
#2  rrw_enter+0x41
#3  ufs_lock+0x27
#4  VOP_LOCK+0x50
#5  vn_lock+0x90
#6  vrele+0x35
#7  unveil_destroy+0x76
#8  exit1+0x378
#9  sys_exit+0x14
#10 syscall+0x3f2
#11 Xsyscall_untramp+0xa9
savecore: no core dump
checking quotas: done.
clearing /tmp
kern.securelevel: 0 -> 1
turning on accounting
witness: lock order reversal:
 1st 0xd0f32d68 allpslk (&allprocess_lock)
 2nd 0xd0ed1114 acctlk (acctlk)
lock order [1] allpslk (&allprocess_lock) -> [2] acctlk (acctlk)
#0  witness_checkorder+0x291
#1  rw_enter_read+0x32
#2  acct_process+0x38
#3  exit1+0x35d
#4  sys_exit+0x14
#5  syscall+0x3f2
#6  Xsyscall_untramp+0xa9
lock order [2] acctlk (acctlk) -> [1] allpslk (&allprocess_lock)
#0  witness_checkorder+0x291
#1  rw_enter_write+0x30
#2  fork1+0x138
#3  kthread_create+0x32
#4  sys_acct+0xfe
#5  syscall+0x3f2
#6  Xsyscall_untramp+0xa9
creating runtime link editor directory cache.
preserving editor files.
starting network daemons: sshd snmpd smtpd sndiod.