From: Alexander Bluhm Subject: Re: fstat sysctl crashes kernel To: Vitaliy Makkoveev Cc: tech@openbsd.org Date: Fri, 9 Aug 2024 10:20:41 +0200 On Thu, Aug 08, 2024 at 09:27:36PM +0300, Vitaliy Makkoveev wrote: > > On 8 Aug 2024, at 20:46, Alexander Bluhm wrote: > > > > On Thu, Aug 08, 2024 at 09:47:27AM -0600, Theo de Raadt wrote: > >> Todd C. Miller wrote: > >> > >>> On Thu, 08 Aug 2024 13:21:54 +0200, Alexander Bluhm wrote: > >>> > >>>> Independent of that, can we commit the chunk below? When searching > >>>> for a specific process, there is not need to traverse the list to > >>>> the end. Also I check for arg >= 0 as this is cosistent with the > >>>> arg < -1 check before. Then you don't have to think about what is > >>>> special about process 0. There is no functional difference as > >>>> process 0 has PS_SYSTEM set and is skipped anyway. > >>> > >>> Yes, this makes sense. OK millert@ > >> > >> No matter what we do, we don't get an atomic view. Punting the problem > >> to userland sidesteps the difficult problem in the kernel. > > > > For me is important that we commit something instead of discussing > > pro and con of various diffs. > > > > The idea with the marker sounds like the correct thing, but adds a > > lot of complexity. mvs@ already worked on something simmilar. The > > easiest fix without locking concerns is my userland fstat diff. > > Locking the whole process list while doing copyout over NFS may > > open another can of worms. > > > > Userland pages are wired at this moment, you can???t stuck in swap > on nfs here, sysctl_vslock() cares about that. > > > Both bluhm@ and mvs@ diff survive my stress test for hours. Usually > > the test crashes or hangs within minutes. Currently I am testing > > with witness to find fallout. > > > > Nice to hear. Personal me, I prefer allprocess rwlock(9). It is much > simpler and left userland as is. Also, is allows to remove kernel > lock around some process stuff like prfind(). For example, from the > filt_procattach() where claudio@ left it to make ps dereference safe. mvs@ diff has finished witness testing. It complains already during boot. reordering: ld.so libc libcrypto sshd sshd-session ssh-agent. starting early daemons: syslogd pflogd ntpd. starting RPC daemons:. witness: lock order reversal: 1st 0xd7af9474 inode (&ip->i_lock) 2nd 0xd0f32d68 allpslk (&allprocess_lock) lock order [1] inode (&ip->i_lock) -> [2] allpslk (&allprocess_lock) lock order data 0xd0ca8820 -> 0xd0cf51b4 is missing lock order [2] allpslk (&allprocess_lock) -> [1] inode (&ip->i_lock) #0 witness_checkorder+0x291 #1 rw_enter+0x4d #2 rrw_enter+0x41 #3 ufs_lock+0x27 #4 VOP_LOCK+0x50 #5 vn_lock+0x90 #6 vrele+0x35 #7 unveil_destroy+0x76 #8 exit1+0x378 #9 sys_exit+0x14 #10 syscall+0x3f2 #11 Xsyscall_untramp+0xa9 savecore: no core dump checking quotas: done. clearing /tmp kern.securelevel: 0 -> 1 turning on accounting witness: lock order reversal: 1st 0xd0f32d68 allpslk (&allprocess_lock) 2nd 0xd0ed1114 acctlk (acctlk) lock order [1] allpslk (&allprocess_lock) -> [2] acctlk (acctlk) #0 witness_checkorder+0x291 #1 rw_enter_read+0x32 #2 acct_process+0x38 #3 exit1+0x35d #4 sys_exit+0x14 #5 syscall+0x3f2 #6 Xsyscall_untramp+0xa9 lock order [2] acctlk (acctlk) -> [1] allpslk (&allprocess_lock) #0 witness_checkorder+0x291 #1 rw_enter_write+0x30 #2 fork1+0x138 #3 kthread_create+0x32 #4 sys_acct+0xfe #5 syscall+0x3f2 #6 Xsyscall_untramp+0xa9 creating runtime link editor directory cache. preserving editor files. starting network daemons: sshd snmpd smtpd sndiod.