From: Mark Kettenis Subject: Re: stacktrace_save_utrace() should be more robust To: Alexandr Nedvedicky Cc: tech@openbsd.org Date: Thu, 22 May 2025 00:02:34 +0200 > Date: Wed, 21 May 2025 17:30:56 +0200 > From: Alexandr Nedvedicky Hi Sasha, > Hello, > > On Wed, May 21, 2025 at 05:07:18PM +0200, Mark Kettenis wrote: > > > Date: Wed, 21 May 2025 16:02:13 +0200 > > > From: Alexandr Nedvedicky > > > > > > Hello, > > > > > > On Wed, May 21, 2025 at 01:11:40PM +0200, Mark Kettenis wrote: > > > > > > > > > > > > > > That seems like a reasonable assumption. But we should probably avoid > > > > the copyin() altogether (and just terminate the trace) if f.f_frame is > > > > a kernel address. So something like: > > > > > > > > if (userland == 1 && INKERNEL(f.f_frame)) > > > > break; > > > > > > > > somewhere near the top of the while loop. > > > > > > > > > > yes, it makes perfect sense. thanks for the suggestion. > > > updated diff follows. it still works on i386. > > > > But you just moved the f.f_retaddr check, I'm talking about adding an > > f.f_frame check. > > > > I'm afraid I don't follow. I'm going to paste the complete > stacktrace_save_utrace() function here after applying the latest diff: > > > 1 void > 2 stacktrace_save_utrace(struct stacktrace *st) > 3 { > 4 struct callframe f, *frame, *lastframe; > 5 struct pcb *pcb = curpcb; > 6 int userland = 0; > 7 > 8 st->st_count = 0; > 9 > 10 if (pcb == NULL) > 11 return; > 12 > 13 frame = __builtin_frame_address(0); > 14 KASSERT(INKERNEL(frame)); > 15 f = *frame; > 16 > 17 curcpu()->ci_inatomic++; > 18 while (st->st_count < STACKTRACE_MAX) { > 19 /* > 20 * Don't trust f.f_retaddr if it came from userland via copyin > 21 */ > 22 if (userland == 1 && INKERNEL(f.f_retaddr)) > 23 break; > 24 > 25 if (f.f_retaddr != 0 && !INKERNEL(f.f_retaddr)) > 26 st->st_pc[st->st_count++] = f.f_retaddr; > 27 > 28 lastframe = frame; > 29 frame = f.f_frame; > 30 > 31 if (frame == NULL) > 32 break; > 33 if (INKERNEL(f.f_retaddr)) { > 34 if (frame <= lastframe) > 35 break; > 36 f = *frame; > 37 continue; > 38 } > 39 if (!INKERNEL(lastframe) && frame <= lastframe) > 40 break; > 41 if (copyin(frame, &f, sizeof(f)) != 0) > 42 break; > 43 userland = 1; > 44 } > 45 curcpu()->ci_inatomic--; > 46 } > > If I understand things right we always start with backtrace in > kernel space (according to KASSERT() at line 14. > > In the first iteration of while loop the userland is zero, > therefore we never take a branch at line 22. > > at line 36 we move to the next frame and jump back to the > beginning of while loop (line 37). > > This goes that way as long as we are walking up the stack > in kernel. As soon as we reach to the frame where return > address is from userland we won't take branch at line > 33 and continue execution at line 39. > > Line 39 makes sure we don't run out from the stack. > > at line 41 we do copyin() the next frame from userland. > at line 43 we make a note that all frames which will > follow must be coming from userland. So we won't > get eventually confused if userland stack is either > corrupted or its stack frame is not set up. Sure, but if the userland stack is corrupted, the copyin on line 41 will read a frame that may also have a bogus frame pointer. So what may happen is that f.f_retaddr is a plausible userland return address, but f.f_frame point somewhere in kernel memory. In that case we'll end up with a copyin that tries to read from kernel memory the next time around. That worried me a bit, but I suppose that'll just fail with an EFAULT.