Download raw body.
SYS_pinsyscalls question
On Sat, Feb 22, 2025 at 04:08:17PM +0000, Marc Espie wrote:
> I was showing our magnificent pinsyscalls(2) framework to students last week.
>
> One question that came up: why is there a |MZERO
> in the malloc for the pins table (around line 360 of uvm_mmap.c)
>
> Digging thru the commit history shows one single commit by Theo.
>
> As far as I can tell, the next copyin will copy over the full allocation
> anyway (exact same size).
>
> The only possible explanation I could fathom is that malloc will round up
> the allocation size, thus there might be data in the roundup area from the
> last allocation.
>
> But, as far as I can tell, M_PINSYSCALL is only used by that subsystem,
> so any remains would have to come from a previous allocation corresponding
> to a process that no longer exist.
>
> Either that process was statically linked, and the pins table is likely
> to be much smaller, or the process is dynamically linked, in which the
> table will be maxsize.
>
> The only case where I actually see a potential problem is when you have
> several libcs with a differing number of syscalls on the system, where
> rounded up allocations *might* leak offsets to the recently added
> syscalls as opposed to the (mostly fixed) array of classic syscalls.
>
> Is this the reason for the MZERO ?
The type (M_) thing is not used to separate any memory, is is merely
used for accounting.
AFAICS this pre-zeroing can be avoided.
I'm responding because I think the syscall can be used to cause memory
leaks, or at least that would be possible to do on other systems with
this code.
First, I realize the syscall is still operating under the big kernel lock.
For a normal single-threaded case I assume repeat calls are going to
error out early as pr->ps_libcpin.pn_start will be set > 0. Albeit I
think this is a bad thing to rely on -- I would add a flag that the
table got pinned already.
Suppose someone crafted a static binary which spawns threads, which then
proceed to call pinsyscalls (and the call was not issued prior). I'm
assuming this is is possible on OpenBSD.
Further assume the addresses to copy from are backed by not-yet-read
files.
Then this copyin:
error = copyin(SCARG(uap, pins), pins, npins * sizeof(u_int));
if (error)
goto err;
will go off cpu and drop the kernel lock, allowing other threads to
reach the same point, all still observing pr->ps_libcpin.pn_start == 0.
Then several threads will reach this assignment:
pr->ps_libcpin.pn_start = base;
is going to override all previously set bufs, leaking them.
SYS_pinsyscalls question