From: Crystal Kolipe
Subject: mfs silent data loss, bio memory alloc issues
To: tech@openbsd.org
Date: Sun, 14 Dec 2025 14:48:23 +0000

There is a subtle bug that can cause MFS to write corrupt data to the
in-memory FFS filesystem, (typically visible as pages of 0x00 bytes
instead of user data).

It's difficult to reproduce in normal operation, but fairly easy in a
synthetic test case.

Basically, in a low-memory condition (*), the kernel address supplied
to copyout() in mfs_doio() is invalid and causes copyout to return
EFAULT.  This sets B_ERROR in bp->b_flags, but the MFS process
otherwise continues and writes nothing to the user memory backing the
FFS blocks corresponding to that I/O request.

(*) But not simply memory exhaustion - obviously MFS will misbehave
when the total memory allocated starts to exceed available physical
RAM, but that is not the case here, and this bug appears to be
different.  For example, on a test machine with around 940 MB of
available RAM, allocating 892 MB to MFS, of which 851 MB is actually
used, the bug is reproducible.  Of note is that second and subsequent
runs of the same reproducer almost always result in correct
behaviour, (until the next reboot).  In other words, there _is_
enough physical memory to, (repeatedly), handle the MFS storage
request without the system becoming unstable.

Even worse, whereas the more common failure mode has obvious effects,
(the system becomes unstable and processes are killed), this failure
just silently returns corrupt data to the userland process doing the
I/O.

Reproducer:

#!/bin/sh
#
# The values given to mount_mfs -s and jot _must_ be tweaked for the
# exact memory size of the test environment!
#
mkdir /ramdisk_1
mount_mfs -s 892m swap /ramdisk_1 || exit
jot 100243865 > /ramdisk_1/file_1
md5 /ramdisk_1/file_1
umount /ramdisk_1
rmdir /ramdisk_1

The values of 892m and 100243865 were adequate for a vm with 1 GB of
RAM, directly launching the kernel, (no BIOS image).  Finding the
sweet spot might take some experimentation:

* If the -s parameter is too large, you get regular out-of-swap
  issues.
* If the -s parameter is too small, you don't hit the bug at all.
* Obviously the output from jot shouldn't exceed the capacity of the
  MFS filesystem either.

Anyway, here is example output from a vm that triggers the bug:

MD5 (/ramdisk_1/file_1) = 33f5967244df27b3be7db52f194c4c02
MD5 (/ramdisk_1/file_1) = f26ccaedba986fe043491f8889660161
MD5 (/ramdisk_1/file_1) = f26ccaedba986fe043491f8889660161
MD5 (/ramdisk_1/file_1) = f26ccaedba986fe043491f8889660161

After a reboot:

MD5 (/ramdisk_1/file_1) = 7c42158a81a816e7f18ad62b810cb17e
MD5 (/ramdisk_1/file_1) = f26ccaedba986fe043491f8889660161
MD5 (/ramdisk_1/file_1) = f26ccaedba986fe043491f8889660161
MD5 (/ramdisk_1/file_1) = f26ccaedba986fe043491f8889660161

After increasing the memory allocated to the vm from 1 GB to 2 GB:

MD5 (/ramdisk_1/file_1) = f26ccaedba986fe043491f8889660161
MD5 (/ramdisk_1/file_1) = f26ccaedba986fe043491f8889660161
MD5 (/ramdisk_1/file_1) = f26ccaedba986fe043491f8889660161
MD5 (/ramdisk_1/file_1) = f26ccaedba986fe043491f8889660161

The MFS code itself is pretty straightforward and I can't see any
obvious issues there.  I'm currently looking through the bio memory
allocation code, but haven't found anything obvious there yet either.

I've tested on 7.8-release and 7.7-release, and the issue is present
in both.  Tested on physical hardware and on vmd vms; it's
reproducible on both.
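
For reference, the path described above boils down to the classic
4.4BSD-style MFS copy routine: the backing store lives in the
mount_mfs process's address space, so mfs_doio() moves each block
with copyin()/copyout() and reports failure only via bp->b_error and
B_ERROR.  The sketch below is a simplified illustration written from
memory, not a verbatim copy of sys/ufs/mfs/mfs_vnops.c, so the exact
prototype and surrounding locking in the tree may differ:

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/buf.h>
#include <ufs/mfs/mfsnode.h>

/*
 * Simplified illustration of the MFS I/O path, (not verbatim kernel
 * source).  The filesystem's backing store is user memory belonging
 * to the mount_mfs process, so data is moved with copyin()/copyout().
 */
void
mfs_doio_sketch(struct mfsnode *mfsp, struct buf *bp)
{
	caddr_t base = mfsp->mfs_baseoff + (bp->b_blkno << DEV_BSHIFT);

	if (bp->b_flags & B_READ)
		/* Fill the kernel buffer from the user-space backing store. */
		bp->b_error = copyin(base, bp->b_data, bp->b_bcount);
	else
		/* Flush the kernel buffer out to the user-space backing store. */
		bp->b_error = copyout(bp->b_data, base, bp->b_bcount);

	if (bp->b_error)
		/*
		 * The failing case described above: copyout() returns
		 * EFAULT, B_ERROR is set, but the block never reaches
		 * the backing store, so a later read of it typically
		 * comes back as 0x00 bytes.
		 */
		bp->b_flags |= B_ERROR;
	else
		bp->b_resid = 0;

	biodone(bp);
}

The point being that the only error indication here is B_ERROR on the
buf; nothing in this path retries the copy, which seems consistent
with the silent corruption described above.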