From: Martin Pieuchot Subject: Re: concerning vfs_stall_barrier() To: Mateusz Guzik Cc: claudio@openbsd.org, tech-openbsd Date: Sat, 13 Sep 2025 11:28:01 +0200 On 13/09/25(Sat) 10:44, Mateusz Guzik wrote: > One of the things which irk me about OpenBSD is vfs_stall_barrier(). I feel the same. This is a common idiom in OpenBSD: use a big and ugly hack and forget about it. This is generally done when we're enable to work together to implement better solutions. > The placement in FREF is racy as the consumer can take the trip > through the lock just before writes are supposed to be frozen, > effectively defeating the mechanism. > > I presume this got recognized and there was an attempt to remedy the problem in: I don't believe it was recognized. Well apart from the empirical freezes or crashes. This issue has been present since the original commit. This has always puzzled me. > commit fae85b1e6961525c0b9017bb021d98e19c2e4451 > Author: claudio > Date: Thu Oct 21 09:59:13 2021 +0000 > > Move vfs_stall_barrier() from the fd layer into vn_lock() and the vfs layer. > vfs stalling is used by suspend/resume and by vmt(4) to stall any > filesystem operation from altering the state on disk. All these > operations will call vn_lock and be stalled. Adjust vfs_stall_barrier() > to allow the lock owner to still progress so that suspend can sync > the filesystems after stalling vfs operation. > OK mpi@ > > However, that got reverted in: > commit 8e49fb4c2a30057c387ac447b5598aa90b4ad4fa > Author: claudio > Date: Mon Oct 25 10:24:54 2021 +0000 > > Revert commitid: ufM9BcSbXqfLpzBH; > Move vfs_stall_barrier() from the fd layer into vn_lock() and the vfs layer. > In some cases it can result in a deadlock while suspending. > Discussed with mpi@ and deraadt@ > > I failed to find public discussions on the matter. There haven't been any. I wish it would have been different from the beginning. > Even so, I can easily imagine how placement in vn_lock results in deadlocks. > > Pick a consumer which locks 2 vnodes (rename as an example) and > suppose it races against a thread stalling vfs ops: > > CPU0 CPU1 > lock vnode1 > stall ops > stalled locking vnode2 > lock vnode1 > > Now both threads are waiting on each other. Indeed. > The correct (albeit time-consuming) way to approach this is to > maintain a counter of writes in progress (this includes vnode > teardown) and only bump it while none of the vnodes are locked yet > *or* do a "trywrite" and failing to get there, unlock, sleep and > retry. For a working example I refer you to FreeBSD, also pointing out > nasty spots which need this kind of treatment. Sure, could you send a diff for that? Thanks!