Index | Thread | Search

From:
"Theo de Raadt" <deraadt@openbsd.org>
Subject:
clang -fret-clean: cleaning return addresses off stack
To:
tech@cvs.openbsd.org
Date:
Sat, 25 May 2024 00:18:59 -0600

Download raw body.

Thread
  • Theo de Raadt:

    clang -fret-clean: cleaning return addresses off stack

There are many address space mitigations in play now which make standard
control-flow methods and ROP-style methods more difficult than ever before.
None of them are a silver bullet; added up they are a big deal, but noone
is saying they are a comprehensive solution,

One thing I've worried about for a while is that program bugs being
exercised tend to happen in the main program, or in some large library.
But many types of attack methodology require reaching system calls via
libc, in as direct and simple fashion as possible.  ASLR location of
libc has made that a bit harder, boot-time random relinking of libc
makes it even more difficult.  But there's a few things which do hint at
where libc is mapped.

The GOT/PLT point at functions in it, and indeed specific libc methods are
reached via them.  The introduction of Xonly (now on most of our architectures)
has made that a bit harder to play with.  But anyways... GOT/PLT changes is
not where I'm going in this mail.

The other thing is that the dead-part of the stack contains specific
addresses to libc code.  Imagine if just before the bug being tickled,
previous code has done a fairly deep and local-variable heavy
call-stack, which made it into some libc interface -- it could be a
system call, or something higher level like stdio.  Upon return closer
to the bug, the dead stack will contain return addresses (the address
after those "callq" inside libc).  An attacker can know the absolute
offset relative to the stack where these values are, and in their early
attack code load those values, and find a way to use them.

Random relinking of libc, and a historical pattern of turning libc into
tons of little .o files, means the attacker can't neccessarily call
other functions, but they can play with functions in the call graph.
They can't read the relevant callq instruction to inspect the called
address, because the text segment isn't readable (hello Xonly), so they
can't directly call what was being called.

So there are two possible outcomes.

1) they have something like printf -> fprintf -> __vfprintf, and hooks to
the actual output functions, a bit difficult to play with, but it's "not nothing".

2) or imagine they have the actual libc.so address space known (because this
is not a remote attack), they now know the offsets of all the system call stubs,
relative to this callq return.

So here's a demo / proposal, for x86 only at first.  This is a new
-fret-clean option in clang, which is used to compile the following:

   libc, libcrypto, ld.so, all the ssh binaries, and the kernel

It is a compiler pass that changes the following:

   callq	 something

into

   callq	 something
   movq		 $0, -8(%rsp)

The callq instruction puts the location of the movq instruction onto the
stack for returning to.  The change is that now, upon return to caller,
the caller itself cleans that value off the stack.  If I got it right :)

A review of the stack after this shows substantial sanitization.  This
doesn't fix all pointers you can see on the stack, just this one type of
'pointer'.  Local variable pointers inside call frames will remain, but
that's harder for an attacker to exercise.

I'm not proposing that we compile many main programs with this.  The retval
data hiding being proposed is most useful near the syscall boundary in libc.so
(and ld.so).  In dynamic libc.so, pinsyscalls(2) prevents one syscall stub
from being a chimera for other operations, but you will still find all the
stubs in libc.so lying around in some random place.

I haven't written a version for any other architectures yet, but I have
a vague idea of what to do.  Some of them use link-register calling
convention, and the retval cleaning will be in the caller, not the
callee.  That's a little bit better because it also hides the retval one
frame higher.

Anyways, food for thought.


Index: gnu/llvm/clang/include/clang/Driver/Options.td
===================================================================
RCS file: /cvs/src/gnu/llvm/clang/include/clang/Driver/Options.td,v
diff -u -p -u -r1.7 Options.td
--- gnu/llvm/clang/include/clang/Driver/Options.td	11 Nov 2023 18:20:20 -0000	1.7
+++ gnu/llvm/clang/include/clang/Driver/Options.td	14 May 2024 14:50:56 -0000
@@ -2857,6 +2857,10 @@ def fno_fixup_gadgets : Flag<["-"], "fno
   HelpText<"Disable FixupGadgets pass (x86 only)">;
 def ffixup_gadgets : Flag<["-"], "ffixup-gadgets">, Group<f_Group>, Flags<[CoreOption]>,
   HelpText<"Replace ROP friendly instructions with safe alternatives (x86 only)">;
+def fno_ret_clean : Flag<["-"], "fno-ret-clean">, Group<f_Group>, Flags<[CoreOption]>,
+  HelpText<"Disable ret-clean pass">;
+def fret_clean : Flag<["-"], "fret-clean">, Group<f_Group>, Flags<[CoreOption]>,
+  HelpText<"Clean return address from stack after call">;
 def ftrivial_auto_var_init_stop_after : Joined<["-"], "ftrivial-auto-var-init-stop-after=">, Group<f_Group>,
   Flags<[CC1Option, CoreOption]>, HelpText<"Stop initializing trivial automatic stack variables after the specified number of instances">,
   MarshallingInfoInt<LangOpts<"TrivialAutoVarInitStopAfter">>;
Index: gnu/llvm/clang/lib/Driver/ToolChains/Clang.cpp
===================================================================
RCS file: /cvs/src/gnu/llvm/clang/lib/Driver/ToolChains/Clang.cpp,v
diff -u -p -u -r1.14 Clang.cpp
--- gnu/llvm/clang/lib/Driver/ToolChains/Clang.cpp	20 Nov 2023 01:35:21 -0000	1.14
+++ gnu/llvm/clang/lib/Driver/ToolChains/Clang.cpp	14 May 2024 14:51:51 -0000
@@ -6402,6 +6402,16 @@ void Clang::ConstructJob(Compilation &C,
       CmdArgs.push_back(Args.MakeArgString(Twine("-x86-fixup-gadgets=true")));
   }
 
+  // -ret-clean
+  if (Arg *A = Args.getLastArg(options::OPT_fno_ret_clean,
+                               options::OPT_fret_clean)) {
+    CmdArgs.push_back(Args.MakeArgString(Twine("-mllvm")));
+    if (A->getOption().matches(options::OPT_fno_ret_clean))
+      CmdArgs.push_back(Args.MakeArgString(Twine("-x86-ret-clean=false")));
+    else if (A->getOption().matches(options::OPT_fret_clean))
+      CmdArgs.push_back(Args.MakeArgString(Twine("-x86-ret-clean=true")));
+  }
+
   RenderSCPOptions(TC, Args, CmdArgs);
   RenderTrivialAutoVarInitOptions(D, TC, Args, CmdArgs);
 
Index: gnu/llvm/llvm/lib/Target/X86/X86.h
===================================================================
RCS file: /cvs/src/gnu/llvm/llvm/lib/Target/X86/X86.h,v
diff -u -p -u -r1.4 X86.h
--- gnu/llvm/llvm/lib/Target/X86/X86.h	11 Nov 2023 18:14:32 -0000	1.4
+++ gnu/llvm/llvm/lib/Target/X86/X86.h	28 Apr 2024 00:47:05 -0000
@@ -132,6 +132,10 @@ FunctionPass *createX86DomainReassignmen
 /// ROP friendly instructions with alternatives.
 FunctionPass *createX86FixupGadgetsPass();
 
+/// Return a Machine Function pass that attempts to replace
+/// RET instructions with a cleaning sequence
+FunctionPass *createX86RetCleanPass();
+
 /// This pass replaces EVEX encoded of AVX-512 instructiosn by VEX
 /// encoding when possible in order to reduce code size.
 FunctionPass *createX86EvexToVexInsts();
Index: gnu/llvm/llvm/lib/Target/X86/X86RetClean.cpp
===================================================================
RCS file: gnu/llvm/llvm/lib/Target/X86/X86RetClean.cpp
diff -N gnu/llvm/llvm/lib/Target/X86/X86RetClean.cpp
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ gnu/llvm/llvm/lib/Target/X86/X86RetClean.cpp	28 Apr 2024 00:47:05 -0000
@@ -0,0 +1,115 @@
+//===-- X86RetClean.cpp - Clean Retaddr off stack upon function return ----===//
+//
+//                     The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+/// \file
+/// This file defines a function pass that clears the ret-address from
+/// the top of the stack, immediately upon return to the caller, the goal
+/// is remove this subtle but powerful info-leak which hints at the
+/// address space location of the lower level libraries.
+///
+//===----------------------------------------------------------------------===//
+
+#include "X86.h"
+#include "X86InstrBuilder.h"
+#include "X86InstrInfo.h"
+#include "X86MachineFunctionInfo.h"
+#include "X86Subtarget.h"
+#include "X86TargetMachine.h"
+#include "llvm/CodeGen/MachineFunction.h"
+#include "llvm/CodeGen/MachineFunctionPass.h"
+#include "llvm/CodeGen/MachineInstrBuilder.h"
+#include "llvm/CodeGen/MachineRegisterInfo.h"
+#include "llvm/CodeGen/Passes.h"
+#include "llvm/MC/MCAsmInfo.h"
+#include "llvm/MC/MCSymbol.h"
+#include "llvm/Support/Debug.h"
+#include "llvm/Support/raw_ostream.h"
+using namespace llvm;
+
+#define RETCLEAN_DESC "X86 Ret Clean"
+#define RETCLEAN_NAME "x86-ret-clean"
+
+#define DEBUG_TYPE RETCLEAN_NAME
+
+// Toggle with cc1 option: -mllvm -x86-ret-clean=<true|false>
+static cl::opt<bool> RetClean(
+    "x86-ret-clean", cl::Hidden,
+    cl::desc("clean return address off stack after call"),
+    cl::init(false));
+
+namespace {
+class RetCleanPass : public MachineFunctionPass {
+
+public:
+  static char ID;
+
+  StringRef getPassName() const override { return RETCLEAN_DESC; }
+
+  RetCleanPass()
+      : MachineFunctionPass(ID) {}
+
+  /// Loop over all the instructions and replace ret with ret+clean
+  bool runOnMachineFunction(MachineFunction &MF) override;
+
+  MachineFunctionProperties getRequiredProperties() const override {
+    return MachineFunctionProperties().set(
+        MachineFunctionProperties::Property::NoVRegs);
+  }
+
+private:
+  bool fixupInstruction(MachineFunction &MF, MachineBasicBlock &MBB,
+                        MachineInstr &MI);
+};
+char RetCleanPass::ID = 0;
+} // namespace
+
+FunctionPass *llvm::createX86RetCleanPass() {
+  return new RetCleanPass();
+}
+
+bool RetCleanPass::fixupInstruction(MachineFunction &MF,
+                               MachineBasicBlock &MBB,
+                               MachineInstr &MI) {
+
+  const X86InstrInfo *TII = MF.getSubtarget<X86Subtarget>().getInstrInfo();
+  bool Is64Bit = MF.getTarget().getTargetTriple().getArch() == Triple::x86_64;
+  unsigned Opc = Is64Bit ? X86::MOV64mi32 : X86::MOV32mi;
+  unsigned Offset = Is64Bit ? -8 : -4;
+  Register SPReg = Is64Bit ? X86::RSP : X86::ESP;
+
+  // add "movq $0, -8(%rsp)" (or similar) in caller, to clear the
+  // ret-addr info-leak off the stack
+  addRegOffset(BuildMI(MBB, MI, MI.getDebugLoc(), TII->get(Opc)),
+    SPReg, false, Offset)
+    .addImm(0);
+  return true;
+}
+
+bool RetCleanPass::runOnMachineFunction(MachineFunction &MF) {
+  if (!RetClean)
+    return false;
+
+  bool modified = false;
+
+  for (auto &MBB : MF) {
+    std::vector<MachineInstr*> fixups;
+    bool foundcall = false;
+
+    for (auto &MI : MBB) {
+      if (MI.isCall()) {
+        foundcall = true;	// queue the insert before the next MI
+      } else if (foundcall) {
+         fixups.push_back(&MI);
+         foundcall = false;
+      }
+    }
+    for (auto *fixup : fixups)
+      modified |= fixupInstruction(MF, MBB, *fixup);
+  }
+  return modified;
+}
Index: gnu/llvm/llvm/lib/Target/X86/X86TargetMachine.cpp
===================================================================
RCS file: /cvs/src/gnu/llvm/llvm/lib/Target/X86/X86TargetMachine.cpp,v
diff -u -p -u -r1.4 X86TargetMachine.cpp
--- gnu/llvm/llvm/lib/Target/X86/X86TargetMachine.cpp	11 Nov 2023 18:14:32 -0000	1.4
+++ gnu/llvm/llvm/lib/Target/X86/X86TargetMachine.cpp	28 Apr 2024 00:47:05 -0000
@@ -596,6 +596,8 @@ void X86PassConfig::addPreEmitPass2() {
   addPass(createX86IndirectThunksPass());
   addPass(createX86ReturnThunksPass());
 
+  addPass(createX86RetCleanPass());
+
   // Insert extra int3 instructions after trailing call instructions to avoid
   // issues in the unwinder.
   if (TT.isOSWindows() && TT.getArch() == Triple::x86_64)
Index: gnu/usr.bin/clang/libLLVMX86CodeGen/Makefile
===================================================================
RCS file: /cvs/src/gnu/usr.bin/clang/libLLVMX86CodeGen/Makefile,v
diff -u -p -u -r1.18 Makefile
--- gnu/usr.bin/clang/libLLVMX86CodeGen/Makefile	11 Nov 2023 18:35:38 -0000	1.18
+++ gnu/usr.bin/clang/libLLVMX86CodeGen/Makefile	28 Apr 2024 00:47:05 -0000
@@ -25,6 +25,7 @@ SRCS+=	X86AsmPrinter.cpp \
 	X86FastISel.cpp \
 	X86FixupBWInsts.cpp \
 	X86FixupGadgets.cpp \
+	X86RetClean.cpp \
 	X86FixupLEAs.cpp \
 	X86AvoidStoreForwardingBlocks.cpp \
 	X86DynAllocaExpander.cpp \
Index: lib/libc/arch/amd64/Makefile.inc
===================================================================
RCS file: /cvs/src/lib/libc/arch/amd64/Makefile.inc,v
diff -u -p -u -r1.5 Makefile.inc
--- lib/libc/arch/amd64/Makefile.inc	9 Jun 2014 20:47:10 -0000	1.5
+++ lib/libc/arch/amd64/Makefile.inc	14 May 2024 15:07:08 -0000
@@ -1 +1,4 @@
 #	$OpenBSD: Makefile.inc,v 1.5 2014/06/09 20:47:10 miod Exp $
+
+#CFLAGS+=-mllvm -x86-ret-clean=true
+CFLAGS+=-fret-clean
Index: lib/libcrypto/arch/amd64/Makefile.inc
===================================================================
RCS file: /cvs/src/lib/libcrypto/arch/amd64/Makefile.inc,v
diff -u -p -u -r1.27 Makefile.inc
--- lib/libcrypto/arch/amd64/Makefile.inc	29 Mar 2024 11:00:57 -0000	1.27
+++ lib/libcrypto/arch/amd64/Makefile.inc	14 May 2024 15:07:41 -0000
@@ -83,3 +83,6 @@ GENERATED+=x86_64cpuid.S
 x86_64cpuid.S: ${LCRYPTO_SRC}/x86_64cpuid.pl ${EXTRA_PL}
 	(cd ${LCRYPTO_SRC}/${dir} ; \
 		/usr/bin/perl ./x86_64cpuid.pl) > ${.TARGET}
+
+#CFLAGS+=-mllvm -x86-ret-clean=true
+CFLAGS+=-fret-clean
Index: libexec/ld.so/amd64/Makefile.inc
===================================================================
RCS file: /cvs/src/libexec/ld.so/amd64/Makefile.inc,v
diff -u -p -u -r1.7 Makefile.inc
--- libexec/ld.so/amd64/Makefile.inc	20 Oct 2019 03:44:49 -0000	1.7
+++ libexec/ld.so/amd64/Makefile.inc	14 May 2024 15:07:59 -0000
@@ -1,6 +1,8 @@
 #	$OpenBSD: Makefile.inc,v 1.7 2019/10/20 03:44:49 guenther Exp $
 
 CFLAGS += -fPIC -mno-sse2 -mno-sse -mno-3dnow -mno-mmx
+#CFLAGS +=-mllvm -x86-ret-clean=true
+CFLAGS +=-fret-clean
 AFLAGS += -fpic
 LD_SCRIPT = ${.CURDIR}/${MACHINE_CPU}/ld.script
 
Index: share/man/man1/clang-local.1
===================================================================
RCS file: /cvs/src/share/man/man1/clang-local.1,v
diff -u -p -u -r1.23 clang-local.1
--- share/man/man1/clang-local.1	18 Feb 2022 00:39:18 -0000	1.23
+++ share/man/man1/clang-local.1	14 May 2024 15:03:54 -0000
@@ -119,6 +119,13 @@ This can be disabled with the
 option.
 .It
 .Nm clang
+includes a security pass that can clear the return address on the
+stack upon return from calling a function, on i386 and amd64.
+This can be enabled with the
+.Fl fret-clean
+option.
+.It
+.Nm clang
 includes the retguard security feature on amd64, arm64, mips64, powerpc
 and powerpc64.
 This feature can be disabled with the
Index: sys/arch/amd64/conf/Makefile.amd64
===================================================================
RCS file: /cvs/src/sys/arch/amd64/conf/Makefile.amd64,v
diff -u -p -u -r1.134 Makefile.amd64
--- sys/arch/amd64/conf/Makefile.amd64	6 Sep 2023 01:47:36 -0000	1.134
+++ sys/arch/amd64/conf/Makefile.amd64	14 May 2024 15:08:41 -0000
@@ -73,6 +73,8 @@ CMACHFLAGS+=	-mno-retpoline -fcf-protect
 .endif
 .else
 CMACHFLAGS+=	-mretpoline-external-thunk -fcf-protection=branch
+#CMACHFLAGS+=	-mllvm -x86-ret-clean=true
+CMACHFLAGS+=	-fret-clean
 .endif
 .if ${COMPILER_VERSION:Mclang}
 NO_INTEGR_AS=	-no-integrated-as
Index: sys/arch/i386/conf/Makefile.i386
===================================================================
RCS file: /cvs/src/sys/arch/i386/conf/Makefile.i386,v
diff -u -p -u -r1.145 Makefile.i386
--- sys/arch/i386/conf/Makefile.i386	28 Jan 2024 00:40:22 -0000	1.145
+++ sys/arch/i386/conf/Makefile.i386	14 May 2024 15:08:24 -0000
@@ -41,6 +41,8 @@ SORTR=		cat
 COPTIMIZE?=	-Oz -mno-retpoline
 .elif ${COMPILER_VERSION:Mclang}
 CMACHFLAGS+=	-mretpoline
+#CMACHFLAGS+=	-mllvm -x86-ret-clean=true
+CMACHFLAGS+=	-fret-clean
 .endif
 .if ${COMPILER_VERSION:Mclang}
 NO_INTEGR_AS=	-no-integrated-as
Index: usr.bin/ssh/Makefile.inc
===================================================================
RCS file: /cvs/src/usr.bin/ssh/Makefile.inc,v
diff -u -p -u -r1.92 Makefile.inc
--- usr.bin/ssh/Makefile.inc	22 May 2024 15:24:55 -0000	1.92
+++ usr.bin/ssh/Makefile.inc	24 May 2024 15:59:55 -0000
@@ -6,6 +6,9 @@ CFLAGS+=	-I${.CURDIR}/..
 .if ${MACHINE} != "hppa"
 CFLAGS+=	-fstack-protector-all
 .endif
+.if ${MACHINE} == "i386" || ${MACHINE} == "amd64"
+CFLAGS+=	-fret-clean
+.endif
 
 CDIAGFLAGS=	-Wall
 CDIAGFLAGS+=	-Wextra
Index: usr.sbin/crunchgen/crunchgen.c
===================================================================
RCS file: /cvs/src/usr.sbin/crunchgen/crunchgen.c,v
diff -u -p -u -r1.27 crunchgen.c
--- usr.sbin/crunchgen/crunchgen.c	14 Sep 2023 16:39:00 -0000	1.27
+++ usr.sbin/crunchgen/crunchgen.c	28 Apr 2024 20:57:19 -0000
@@ -897,6 +897,7 @@ top_makefile_rules(FILE * outmk)
 	fprintf(outmk, "CFLAGS+=-fno-unwind-tables\n");
 	fprintf(outmk, ".if ${MACHINE} == \"amd64\"\n");
 	fprintf(outmk, "CFLAGS+=-fcf-protection=none\n");
+	fprintf(outmk, "CFLAGS+=-mllvm -x86-ret-clean=false\n");
 	fprintf(outmk, ".endif\n");
 	fprintf(outmk, ".if ${MACHINE} == \"arm64\"\n");
 	fprintf(outmk, "CFLAGS+=-mbranch-protection=none\n");
Index: distrib/amd64/iso/Makefile
===================================================================
RCS file: /cvs/src/distrib/amd64/iso/Makefile,v
diff -u -p -u -r1.47 Makefile
--- distrib/amd64/iso/Makefile	15 Dec 2023 06:03:00 -0000	1.47
+++ distrib/amd64/iso/Makefile	29 Apr 2024 01:42:08 -0000
@@ -1,7 +1,7 @@
 #	$OpenBSD: Makefile,v 1.47 2023/12/15 06:03:00 jmatthew Exp $
 
 FS=		install${OSrev}.img
-FSSIZE=		1359872
+FSSIZE=		1425408
 CDROM=		install${OSrev}.iso
 
 MOUNT_POINT=	/mnt
Index: distrib/special/Makefile.inc
===================================================================
RCS file: /cvs/src/distrib/special/Makefile.inc,v
diff -u -p -u -r1.10 Makefile.inc
--- distrib/special/Makefile.inc	16 Apr 2023 19:57:01 -0000	1.10
+++ distrib/special/Makefile.inc	14 May 2024 15:07:17 -0000
@@ -5,6 +5,8 @@ COPTS+=-fno-unwind-tables -fno-asynchron
 
 .if ${MACHINE} == "amd64"
 COPTS+=-fcf-protection=none
+#COPTS+=-mllvm -x86-ret-clean=false
+COPTS+=-fno-ret-clean
 .endif
 .if ${MACHINE} == "arm64"
 COPTS+=-mbranch-protection=none