Index | Thread | Search

From:
Mark Kettenis <mark.kettenis@xs4all.nl>
Subject:
Re: fstat sysctl crashes kernel
To:
Alexander Bluhm <bluhm@openbsd.org>
Cc:
otto@bsdbox.dev, tech@openbsd.org
Date:
Fri, 09 Aug 2024 12:37:25 +0200

Download raw body.

Thread
> Date: Fri, 9 Aug 2024 10:20:41 +0200
> From: Alexander Bluhm <bluhm@openbsd.org>
> 
> On Thu, Aug 08, 2024 at 09:27:36PM +0300, Vitaliy Makkoveev wrote:
> > > On 8 Aug 2024, at 20:46, Alexander Bluhm <bluhm@openbsd.org> wrote:
> > > 
> > > On Thu, Aug 08, 2024 at 09:47:27AM -0600, Theo de Raadt wrote:
> > >> Todd C. Miller <millert@openbsd.org> wrote:
> > >> 
> > >>> On Thu, 08 Aug 2024 13:21:54 +0200, Alexander Bluhm wrote:
> > >>> 
> > >>>> Independent of that, can we commit the chunk below?  When searching
> > >>>> for a specific process, there is not need to traverse the list to
> > >>>> the end.  Also I check for arg >= 0 as this is cosistent with the
> > >>>> arg < -1 check before.  Then you don't have to think about what is
> > >>>> special about process 0.  There is no functional difference as
> > >>>> process 0 has PS_SYSTEM set and is skipped anyway.
> > >>> 
> > >>> Yes, this makes sense. OK millert@
> > >> 
> > >> No matter what we do, we don't get an atomic view.  Punting the problem
> > >> to userland sidesteps the difficult problem in the kernel.
> > > 
> > > For me is important that we commit something instead of discussing
> > > pro and con of various diffs.
> > > 
> > > The idea with the marker sounds like the correct thing, but adds a
> > > lot of complexity.  mvs@ already worked on something simmilar.  The
> > > easiest fix without locking concerns is my userland fstat diff.
> > > Locking the whole process list while doing copyout over NFS may
> > > open another can of worms.
> > > 
> > 
> > Userland pages are wired at this moment, you can???t stuck in swap
> > on nfs here, sysctl_vslock() cares about that.
> > 
> > > Both bluhm@ and mvs@ diff survive my stress test for hours.  Usually
> > > the test crashes or hangs within minutes.  Currently I am testing
> > > with witness to find fallout.
> > > 
> > 
> > Nice to hear. Personal me, I prefer allprocess rwlock(9). It is much
> > simpler and left userland as is. Also, is allows to remove kernel
> > lock around some process stuff like prfind(). For example, from the 
> > filt_procattach() where claudio@ left it to make ps dereference safe.
> 
> mvs@ diff has finished witness testing.  It complains already during
> boot.
> 
> reordering: ld.so libc libcrypto sshd sshd-session ssh-agent.
> starting early daemons: syslogd pflogd ntpd.
> starting RPC daemons:.
> witness: lock order reversal:
>  1st 0xd7af9474 inode (&ip->i_lock)
>  2nd 0xd0f32d68 allpslk (&allprocess_lock)
> lock order [1] inode (&ip->i_lock) -> [2] allpslk (&allprocess_lock)
> lock order data 0xd0ca8820 -> 0xd0cf51b4 is missing
> lock order [2] allpslk (&allprocess_lock) -> [1] inode (&ip->i_lock)
> #0  witness_checkorder+0x291
> #1  rw_enter+0x4d
> #2  rrw_enter+0x41
> #3  ufs_lock+0x27
> #4  VOP_LOCK+0x50
> #5  vn_lock+0x90
> #6  vrele+0x35
> #7  unveil_destroy+0x76
> #8  exit1+0x378
> #9  sys_exit+0x14
> #10 syscall+0x3f2
> #11 Xsyscall_untramp+0xa9
> savecore: no core dump
> checking quotas: done.
> clearing /tmp
> kern.securelevel: 0 -> 1
> turning on accounting
> witness: lock order reversal:
>  1st 0xd0f32d68 allpslk (&allprocess_lock)
>  2nd 0xd0ed1114 acctlk (acctlk)
> lock order [1] allpslk (&allprocess_lock) -> [2] acctlk (acctlk)
> #0  witness_checkorder+0x291
> #1  rw_enter_read+0x32
> #2  acct_process+0x38
> #3  exit1+0x35d
> #4  sys_exit+0x14
> #5  syscall+0x3f2
> #6  Xsyscall_untramp+0xa9
> lock order [2] acctlk (acctlk) -> [1] allpslk (&allprocess_lock)
> #0  witness_checkorder+0x291
> #1  rw_enter_write+0x30
> #2  fork1+0x138
> #3  kthread_create+0x32
> #4  sys_acct+0xfe
> #5  syscall+0x3f2
> #6  Xsyscall_untramp+0xa9
> creating runtime link editor directory cache.
> preserving editor files.
> starting network daemons: sshd snmpd smtpd sndiod.

We should not introduce more lock order reversals.

And I actually agree with Mateusz: the fact that you can sleep while
holding an rwlock doesn't mean it is a good idea.