
From:
Mark Kettenis <mark.kettenis@xs4all.nl>
Subject:
Re: Emulate CPU ID register access on arm64
To:
Jeremie Courreges-Anglas <jca@wxcvbn.org>
Cc:
patrick@openbsd.org, tech@openbsd.org, brad@comstyle.com
Date:
Wed, 17 Jul 2024 22:40:57 +0200

> Date: Sun, 14 Jul 2024 12:52:36 +0200
> From: Jeremie Courreges-Anglas <jca@wxcvbn.org>

The code rearrange bit is in, so that leaves the emulation bits.

> On Sun, Jul 14, 2024 at 11:25:27AM +0200, Mark Kettenis wrote:
> > As mentioned in the HWCAP discussion, new arm64 bits are typically no
> > longer assigned and instead the HWCAP_CPUID bit is used to signal that
> > the kernel emulates access to the CPU ID registers.  The diff below
> > implements this.
> > 
> > The architecture is clearly designed to make emulation possible and
> > easy.  There is a special trap for MSR access and it provides all the
> > details needed to emulate access without the need to read the
> > instruction.  The trapping from EL0 is probably done to let the OS
> > sanitize the values such that only features relevant to userland and
> > common to all CPU cores in the system are advertised.
> > 
> > There are some open questions though.  The diff is rather strict in
> > what access it emulates.  For now this is only the "known" ID_AA64_xxx
> > registers, that is the ID_AA64_xxx registers that are currently
> > defined by the architecture.  I think the architecture actually says
> > that the currently undefined ID_AA64_xxx registers should return zero,
> > and Linux allows access to them all.  But I would like to know when
> > userland processes actually start accessing those, so for now we'll
> > SIGILL.
> 
> ack
> 
> > I didn't implement emulation for the "32-bit" ID_xxx registers.  We
> > don't support executing 32-bit processes so these registers should be
> > irrelevant.  But they can be accessed using arm64 instructions, so for
> > now we'll continue to SIGILL these as well.
> 
> No idea how often those registers might be used.
> 
> > Then there are the MIDR_EL1 and MPIDR_EL1 registers.  Linux allows
> > access to these registers.  The problem with these is that their
> > values may depend on what CPU a process is executed on.  So in general
> > I don't think userland code should look at these.  However, some
> > userland code looks at these since certain optimizations might apply
> > only to specific CPU implementations.  On amd64, where the CPUID
> > instruction is available to userland, looking at the CPU family and
> > model bits is accepted practice.
> 
> Looks like killing processes for (at least) MIDR_EL1 access would lead
> to breakage in a bunch of ports.
> 
> For example there is devel/abseil-cpp and the copies of it bundled
> into other ports.  The detection is currently done through
> getauxval(3) but letting this SIGILL would be an inconvenient trap for
> anyone adding detection through elf_aux_info(3).

Ok.  So that does a microarchitectural optimization of the CRC
calculation.  Looks like it only does that for hardware that Google
cares about in production though.  So they avoid the issue with having
different cores integrated on a single SoC.

So I've implemented MIDR_EL1, MPIDR_EL1 and REVIDR_EL1 in the same way
as Linux, passing through MIDR_EL1 (the core type) and faking the
other registers.

> https://codesearch.debian.net/search?q=%26.*HWCAP_CPUID&literal=0
> 
> Granted, as long as we don't advertise HWCAP_CPUID your proposal is
> still more lenient/useful than our current policy of always having
> userland trap.
> 
> > Thoughts?
> 
> I won't address all your open questions, I'll just point out that the
> refactoring using cpu_identify_cleanup() makes sense on its own and
> would be useful to implement the rest of the HWCAP_* bits.  This part
> of the changes is ok jca@ FWIW, I don't feel particularly qualified to
> look at the MSR code/defines changes.
> 
> BTW you're introducing a static function there:

Yeah, this file already has those.

> > +static int
> > +emulate_msr(struct trapframe *frame, uint64_t esr)
> > +{

Index: arch/arm64/arm64/trap.c
===================================================================
RCS file: /cvs/src/sys/arch/arm64/arm64/trap.c,v
retrieving revision 1.48
diff -u -p -r1.48 trap.c
--- arch/arm64/arm64/trap.c	21 Feb 2024 15:53:07 -0000	1.48
+++ arch/arm64/arm64/trap.c	17 Jul 2024 20:35:56 -0000
@@ -187,6 +187,99 @@ kdata_abort(struct trapframe *frame, uin
 	}
 }
 
+static int
+emulate_msr(struct trapframe *frame, uint64_t esr)
+{
+	u_int rt = ISS_MSR_Rt(esr);
+	uint64_t val;
+
+	/* Only emulate reads. */
+	if ((esr & ISS_MSR_DIR) == 0)
+		return 0;
+
+	/* Only emulate non-debug System register access. */
+	if (ISS_MSR_OP0(esr) != 3 || ISS_MSR_OP1(esr) != 0 ||
+	    ISS_MSR_CRn(esr) != 0)
+		return 0;
+
+	switch (ISS_MSR_CRm(esr)) {
+	case 0:
+		switch (ISS_MSR_OP2(esr)) {
+		case 0:		/* MIDR_EL1 */
+			val = READ_SPECIALREG(midr_el1);
+			break;
+		case 5:		/* MPIDR_EL1 */
+			/*
+			 * Don't reveal the topology to userland.  But
+			 * return a valid value; Bit 31 is RES1.
+			 */
+			val = 0x80000000;
+			break;
+		case 6:		/* REVIDR_EL1 */
+			val = 0;
+			break;
+		default:
+			return 0;
+		}
+		break;
+	case 4:
+		switch (ISS_MSR_OP2(esr)) {
+		case 0:		/* ID_AA64PFR0_EL1 */
+			val = cpu_id_aa64pfr0;
+			break;
+		case 1:		/* ID_AA64PFR1_EL1 */
+			val = cpu_id_aa64pfr1;
+			break;
+		case 2:		/* ID_AA64PFR2_EL1 */
+		case 4:		/* ID_AA64ZFR0_EL1 */
+		case 5:		/* ID_AA64SMFR0_EL1 */
+			val = 0;
+			break;
+		default:
+			return 0;
+		}
+		break;
+	case 6:
+		switch (ISS_MSR_OP2(esr)) {
+		case 0:	/* ID_AA64ISAR0_EL1 */
+			val = cpu_id_aa64isar0;
+			break;
+		case 1: /* ID_AA64ISAR1_EL1 */
+			val = cpu_id_aa64isar1;
+			break;
+		case 2: /* ID_AA64ISAR2_EL1 */
+			val = cpu_id_aa64isar2;
+			break;
+		default:
+			return 0;
+		}
+		break;
+	case 7:
+		switch (ISS_MSR_OP2(esr)) {
+		case 0: /* ID_AA64MMFR0_EL1 */
+		case 1: /* ID_AA64MMFR1_EL1 */
+		case 2: /* ID_AA64MMFR2_EL1 */
+		case 3: /* ID_AA64MMFR3_EL1 */
+		case 4: /* ID_AA64MMFR4_EL1 */
+			val = 0;
+			break;
+		default:
+			return 0;
+		}
+		break;
+	default:
+		return 0;
+	}
+
+	if (rt < 30)
+		frame->tf_x[rt] = val;
+	else if (rt == 30)
+		frame->tf_lr = val;
+	frame->tf_elr += 4;
+
+	return 1;
+}
+
 void
 do_el1h_sync(struct trapframe *frame)
 {
@@ -288,6 +381,10 @@ do_el0_sync(struct trapframe *frame)
 		sv.sival_ptr = (void *)frame->tf_elr;
 		trapsignal(p, SIGILL, esr, ILL_BTCFI, sv);
 		break;
+	case EXCP_MSR:
+		if (emulate_msr(frame, esr))
+			break;
+		/* FALLTHROUGH */
 	case EXCP_FPAC:
 		curcpu()->ci_flush_bp();
 		sv.sival_ptr = (void *)frame->tf_elr;
Index: arch/arm64/include/armreg.h
===================================================================
RCS file: /cvs/src/sys/arch/arm64/include/armreg.h,v
retrieving revision 1.35
diff -u -p -r1.35 armreg.h
--- arch/arm64/include/armreg.h	23 Jun 2024 10:17:16 -0000	1.35
+++ arch/arm64/include/armreg.h	17 Jul 2024 20:35:56 -0000
@@ -171,6 +171,26 @@
 #define	 ISS_DATA_DFSC_ECC_L3	(0x1f << 0)
 #define	 ISS_DATA_DFSC_ALIGN	(0x21 << 0)
 #define	 ISS_DATA_DFSC_TLB_CONFLICT (0x30 << 0)
+#define	 ISS_MSR_DIR_SHIFT	0
+#define	 ISS_MSR_DIR		(0x01 << ISS_MSR_DIR_SHIFT)
+#define	 ISS_MSR_Rt_SHIFT	5
+#define	 ISS_MSR_Rt_MASK	(0x1f << ISS_MSR_Rt_SHIFT)
+#define	 ISS_MSR_Rt(x)		(((x) & ISS_MSR_Rt_MASK) >> ISS_MSR_Rt_SHIFT)
+#define	 ISS_MSR_CRm_SHIFT	1
+#define	 ISS_MSR_CRm_MASK	(0xf << ISS_MSR_CRm_SHIFT)
+#define	 ISS_MSR_CRm(x)		(((x) & ISS_MSR_CRm_MASK) >> ISS_MSR_CRm_SHIFT)
+#define	 ISS_MSR_CRn_SHIFT	10
+#define	 ISS_MSR_CRn_MASK	(0xf << ISS_MSR_CRn_SHIFT)
+#define	 ISS_MSR_CRn(x)		(((x) & ISS_MSR_CRn_MASK) >> ISS_MSR_CRn_SHIFT)
+#define	 ISS_MSR_OP1_SHIFT	14
+#define	 ISS_MSR_OP1_MASK	(0x7 << ISS_MSR_OP1_SHIFT)
+#define	 ISS_MSR_OP1(x)		(((x) & ISS_MSR_OP1_MASK) >> ISS_MSR_OP1_SHIFT)
+#define	 ISS_MSR_OP2_SHIFT	17
+#define	 ISS_MSR_OP2_MASK	(0x7 << ISS_MSR_OP2_SHIFT)
+#define	 ISS_MSR_OP2(x)		(((x) & ISS_MSR_OP2_MASK) >> ISS_MSR_OP2_SHIFT)
+#define	 ISS_MSR_OP0_SHIFT	20
+#define	 ISS_MSR_OP0_MASK	(0x3 << ISS_MSR_OP0_SHIFT)
+#define	 ISS_MSR_OP0(x)		(((x) & ISS_MSR_OP0_MASK) >> ISS_MSR_OP0_SHIFT)
 #define	ESR_ELx_IL		(0x01 << 25)
 #define	ESR_ELx_EC_SHIFT	26
 #define	ESR_ELx_EC_MASK		(0x3f << 26)