From: Mateusz Guzik
Subject: concerning vfs_stall_barrier()
To: claudio@openbsd.org
Cc: tech-openbsd
Date: Sat, 13 Sep 2025 10:44:25 +0200

One of the things which irk me about OpenBSD is vfs_stall_barrier().

The placement in FREF is racy as the consumer can take the trip through
the lock just before writes are supposed to be frozen, effectively
defeating the mechanism.

I presume this got recognized and there was an attempt to remedy the
problem in:

commit fae85b1e6961525c0b9017bb021d98e19c2e4451
Author: claudio
Date:   Thu Oct 21 09:59:13 2021 +0000

    Move vfs_stall_barrier() from the fd layer into vn_lock() and the
    vfs layer. vfs stalling is used by suspend/resume and by vmt(4) to
    stall any filesystem operation from altering the state on disk.
    All these operations will call vn_lock and be stalled. Adjust
    vfs_stall_barrier() to allow the lock owner to still progress so
    that suspend can sync the filesystems after stalling vfs operation.
    OK mpi@

However, that got reverted in:

commit 8e49fb4c2a30057c387ac447b5598aa90b4ad4fa
Author: claudio
Date:   Mon Oct 25 10:24:54 2021 +0000

    Revert commitid: ufM9BcSbXqfLpzBH;
    Move vfs_stall_barrier() from the fd layer into vn_lock() and the
    vfs layer.

    In some cases it can result in a deadlock while suspending.
    Discussed with mpi@ and deraadt@

I failed to find public discussions on the matter.

Even so, I can easily imagine how placement in vn_lock results in
deadlocks. Pick a consumer which locks 2 vnodes (rename as an example)
and suppose it races against a thread stalling vfs ops:

CPU0                            CPU1
lock vnode1
                                stall ops
stalled locking vnode2
                                lock vnode1

Now both threads are waiting on each other.

The correct (albeit time-consuming) way to approach this is to maintain
a counter of writes in progress (this includes vnode teardown) and only
bump it while none of the vnodes are locked yet *or* do a "trywrite"
and failing to get there, unlock, sleep and retry.
For a working example I refer you to FreeBSD, also pointing out nasty spots which need this kind of treatment.
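
To make the counter idea concrete, here is a minimal userspace sketch
using pthreads. It is only a model of the scheme described above, not
OpenBSD or FreeBSD code: struct write_gate and every gate_*() name are
hypothetical, a single suspending thread is assumed, and a real kernel
version would use its own sleep/wakeup primitives and per-mount state.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical model; none of these names exist in any kernel. */
struct write_gate {
	pthread_mutex_t	mtx;
	pthread_cond_t	cv;
	int		writers;	/* write operations in progress */
	bool		suspending;	/* a freeze has been requested */
};

void
gate_init(struct write_gate *g)
{
	pthread_mutex_init(&g->mtx, NULL);
	pthread_cond_init(&g->cv, NULL);
	g->writers = 0;
	g->suspending = false;
}

/* Blocking entry: only safe while the caller holds no vnode locks. */
void
gate_write_start(struct write_gate *g)
{
	pthread_mutex_lock(&g->mtx);
	while (g->suspending)
		pthread_cond_wait(&g->cv, &g->mtx);
	g->writers++;
	pthread_mutex_unlock(&g->mtx);
}

/*
 * "trywrite": never sleeps. On failure the caller is expected to
 * unlock its vnodes, call gate_write_start() and retry from scratch.
 */
bool
gate_write_try(struct write_gate *g)
{
	bool ok;

	pthread_mutex_lock(&g->mtx);
	ok = !g->suspending;
	if (ok)
		g->writers++;
	pthread_mutex_unlock(&g->mtx);
	return ok;
}

/* Drop the count; the last writer out wakes a waiting suspender. */
void
gate_write_finish(struct write_gate *g)
{
	pthread_mutex_lock(&g->mtx);
	assert(g->writers > 0);
	if (--g->writers == 0)
		pthread_cond_broadcast(&g->cv);
	pthread_mutex_unlock(&g->mtx);
}

/* Refuse new writers, then wait for the in-flight ones to drain. */
void
gate_suspend(struct write_gate *g)
{
	pthread_mutex_lock(&g->mtx);
	g->suspending = true;
	while (g->writers > 0)
		pthread_cond_wait(&g->cv, &g->mtx);
	pthread_mutex_unlock(&g->mtx);
}

/* Lift the freeze and wake writers sleeping in gate_write_start(). */
void
gate_resume(struct write_gate *g)
{
	pthread_mutex_lock(&g->mtx);
	g->suspending = false;
	pthread_cond_broadcast(&g->cv);
	pthread_mutex_unlock(&g->mtx);
}
```

In this model a rename-like consumer would call gate_write_start()
before locking either vnode and gate_write_finish() after unlocking
both, so it never sleeps on the gate while holding a vnode lock. The
thread doing the suspend only starts locking vnodes to sync them after
gate_suspend() has returned, at which point no writer holds one.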