Index | Thread | Search

From:
Mark Kettenis <mark.kettenis@xs4all.nl>
Subject:
Re: stacktrace_save_utrace() should be more robust
To:
Alexandr Nedvedicky <sashan@fastmail.net>
Cc:
tech@openbsd.org
Date:
Thu, 22 May 2025 00:02:34 +0200

Download raw body.

Thread
> Date: Wed, 21 May 2025 17:30:56 +0200
> From: Alexandr Nedvedicky <sashan@fastmail.net>

Hi Sasha,

> Hello,
> 
> On Wed, May 21, 2025 at 05:07:18PM +0200, Mark Kettenis wrote:
> > > Date: Wed, 21 May 2025 16:02:13 +0200
> > > From: Alexandr Nedvedicky <sashan@fastmail.net>
> > > 
> > > Hello,
> > > 
> > > On Wed, May 21, 2025 at 01:11:40PM +0200, Mark Kettenis wrote:
> > > 
> > > </snip>
> > > > 
> > > > That seems like a reasonable assumption.  But we should probably avoid
> > > > the copyin() altogether (and just terminate the trace) if f.f_frame is
> > > > a kernel address.  So something like:
> > > > 
> > > >                 if (userland == 1 && INKERNEL(f.f_frame))
> > > > 		        break;
> > > > 
> > > > somewhere near the top of the while loop.
> > > > 
> > > 
> > >     yes, it makes perfect sense. thanks for the suggestion.
> > >     updated diff follows. it still works on i386.
> > 
> > But you just moved the f.f_retaddr check, I'm talking about adding an
> > f.f_frame check.
> > 
> 
> I'm afraid I don't follow. I'm going to paste the complete
> stacktrace_save_utrace() function here after applying the latest diff:
> 
> 
>      1	void
>      2	stacktrace_save_utrace(struct stacktrace *st)
>      3	{
>      4		struct callframe f, *frame, *lastframe;
>      5		struct pcb *pcb = curpcb;
>      6		int userland = 0;
>      7	
>      8		st->st_count = 0;
>      9	
>     10		if (pcb == NULL)
>     11			return;
>     12	
>     13		frame = __builtin_frame_address(0);
>     14		KASSERT(INKERNEL(frame));
>     15		f = *frame;
>     16	
>     17		curcpu()->ci_inatomic++;
>     18		while (st->st_count < STACKTRACE_MAX) {
>     19			/*
>     20			 * Don't trust f.f_retaddr if it came from userland via copyin
>     21			 */
>     22			if (userland == 1 && INKERNEL(f.f_retaddr))
>     23				break;
>     24	
>     25			if (f.f_retaddr != 0 && !INKERNEL(f.f_retaddr))
>     26				st->st_pc[st->st_count++] = f.f_retaddr;
>     27	
>     28			lastframe = frame;
>     29			frame = f.f_frame;
>     30	
>     31			if (frame == NULL)
>     32				break;
>     33			if (INKERNEL(f.f_retaddr)) {
>     34				if (frame <= lastframe)
>     35					break;
>     36				f = *frame;
>     37				continue;
>     38			}
>     39			if (!INKERNEL(lastframe) && frame <= lastframe)
>     40				break;
>     41			if (copyin(frame, &f, sizeof(f)) != 0)
>     42				break;
>     43			userland = 1;
>     44		}
>     45		curcpu()->ci_inatomic--;
>     46	}
> 
> If I understand things right we always start with backtrace in
> kernel space (according to KASSERT() at line 14.
> 
> In the first iteration of while loop the userland is zero,
> therefore we never take a branch at line 22.
> 
> at line 36 we move to the next frame and jump back to the
> beginning of while loop (line 37).
> 
> This goes that way as long as we are walking up the stack
> in kernel. As soon as we reach to the frame where return
> address is from userland we won't take branch at line
> 33 and continue execution at line 39.
> 
> Line 39 makes sure we don't run out from the stack.
> 
> at line 41 we do copyin() the next frame from userland.
> at line 43 we make a note that all frames which will
> follow must be coming from userland. So we won't
> get eventually confused if userland stack is either
> corrupted or its stack frame is not set up.

Sure, but if the userland stack is corrupted, the copyin on line 41
will read a frame that may also have a bogus frame pointer.  So what
may happen is that f.f_retaddr is a plausible userland return address,
but f.f_frame point somewhere in kernel memory.  In that case we'll
end up with a copyin that tries to read from kernel memory the next
time around.  That worried me a bit, but I suppose that'll just fail
with an EFAULT.