Download raw body.
stacktrace_save_utrace() should be more robust
> Date: Wed, 21 May 2025 17:30:56 +0200
> From: Alexandr Nedvedicky <sashan@fastmail.net>
Hi Sasha,
> Hello,
>
> On Wed, May 21, 2025 at 05:07:18PM +0200, Mark Kettenis wrote:
> > > Date: Wed, 21 May 2025 16:02:13 +0200
> > > From: Alexandr Nedvedicky <sashan@fastmail.net>
> > >
> > > Hello,
> > >
> > > On Wed, May 21, 2025 at 01:11:40PM +0200, Mark Kettenis wrote:
> > >
> > > </snip>
> > > >
> > > > That seems like a reasonable assumption. But we should probably avoid
> > > > the copyin() altogether (and just terminate the trace) if f.f_frame is
> > > > a kernel address. So something like:
> > > >
> > > > if (userland == 1 && INKERNEL(f.f_frame))
> > > > break;
> > > >
> > > > somewhere near the top of the while loop.
> > > >
> > >
> > > yes, it makes perfect sense. thanks for the suggestion.
> > > updated diff follows. it still works on i386.
> >
> > But you just moved the f.f_retaddr check, I'm talking about adding an
> > f.f_frame check.
> >
>
> I'm afraid I don't follow. I'm going to paste the complete
> stacktrace_save_utrace() function here after applying the latest diff:
>
>
> 1 void
> 2 stacktrace_save_utrace(struct stacktrace *st)
> 3 {
> 4 struct callframe f, *frame, *lastframe;
> 5 struct pcb *pcb = curpcb;
> 6 int userland = 0;
> 7
> 8 st->st_count = 0;
> 9
> 10 if (pcb == NULL)
> 11 return;
> 12
> 13 frame = __builtin_frame_address(0);
> 14 KASSERT(INKERNEL(frame));
> 15 f = *frame;
> 16
> 17 curcpu()->ci_inatomic++;
> 18 while (st->st_count < STACKTRACE_MAX) {
> 19 /*
> 20 * Don't trust f.f_retaddr if it came from userland via copyin
> 21 */
> 22 if (userland == 1 && INKERNEL(f.f_retaddr))
> 23 break;
> 24
> 25 if (f.f_retaddr != 0 && !INKERNEL(f.f_retaddr))
> 26 st->st_pc[st->st_count++] = f.f_retaddr;
> 27
> 28 lastframe = frame;
> 29 frame = f.f_frame;
> 30
> 31 if (frame == NULL)
> 32 break;
> 33 if (INKERNEL(f.f_retaddr)) {
> 34 if (frame <= lastframe)
> 35 break;
> 36 f = *frame;
> 37 continue;
> 38 }
> 39 if (!INKERNEL(lastframe) && frame <= lastframe)
> 40 break;
> 41 if (copyin(frame, &f, sizeof(f)) != 0)
> 42 break;
> 43 userland = 1;
> 44 }
> 45 curcpu()->ci_inatomic--;
> 46 }
>
> If I understand things right we always start with backtrace in
> kernel space (according to KASSERT() at line 14.
>
> In the first iteration of while loop the userland is zero,
> therefore we never take a branch at line 22.
>
> at line 36 we move to the next frame and jump back to the
> beginning of while loop (line 37).
>
> This goes that way as long as we are walking up the stack
> in kernel. As soon as we reach to the frame where return
> address is from userland we won't take branch at line
> 33 and continue execution at line 39.
>
> Line 39 makes sure we don't run out from the stack.
>
> at line 41 we do copyin() the next frame from userland.
> at line 43 we make a note that all frames which will
> follow must be coming from userland. So we won't
> get eventually confused if userland stack is either
> corrupted or its stack frame is not set up.
Sure, but if the userland stack is corrupted, the copyin on line 41
will read a frame that may also have a bogus frame pointer. So what
may happen is that f.f_retaddr is a plausible userland return address,
but f.f_frame point somewhere in kernel memory. In that case we'll
end up with a copyin that tries to read from kernel memory the next
time around. That worried me a bit, but I suppose that'll just fail
with an EFAULT.
stacktrace_save_utrace() should be more robust