Download raw body.
Emulate CPU ID register access on arm64
> Date: Sun, 21 Jul 2024 21:41:47 +0200
> From: Jeremie Courreges-Anglas <jca@wxcvbn.org>
>
> On Wed, Jul 17, 2024 at 10:40:57PM +0200, Mark Kettenis wrote:
> > > Date: Sun, 14 Jul 2024 12:52:36 +0200
> > > From: Jeremie Courreges-Anglas <jca@wxcvbn.org>
> >
> > The code rearrange bit is in, so that leaves the emulation bits.
> >
> > > On Sun, Jul 14, 2024 at 11:25:27AM +0200, Mark Kettenis wrote:
> > > > As mentioned in the HWCAP discussion, new arm64 bits are typically no
> > > > longer assigned and instead the HWCAP_CPUID bit is used to signal that
> > > > the kernel emulates access to the CPU ID registers. The diff below
> > > > implements this.
> > > >
> > > > The architecture is clearly designed to make emulattion possible and
> > > > easy. There is a special trap for MSR access and it provides all the
> > > > detailt needed to emulate access without the need to read the
> > > > instruction. The trapping from EL0 is probably done to let the OS
> > > > sanitize the values such that only features relevant to userland and
> > > > common to all CPU cores in the system are advertised.
> > > >
> > > > There are some open questions though. The diff is rather strict in
> > > > what access it emulates. For now this is only the "known" ID_AA64_xxx
> > > > registers, that is the ID_AA64_xxx registers that are currently
> > > > defined by the architectures. I think the architecture actually says
> > > > that the currently undefined ID_AA64_xxx registers should return zero,
> > > > and Linux allows access to them all. But I would like to know when
> > > > userland processes actually start accessing those, so for now we'll
> > > > SIGILL.
> > >
> > > ack
> > >
> > > > I didn't implement emulation for the "32-bit" ID_xxx registers. We
> > > > don't support excuting 32-bit processes so these registers should be
> > > > irrelevant. But they can be accessed using arm64 instructions, so for
> > > > now we'll continue to SIGILL these as well.
> > >
> > > No idea how often those registers might be used.
> > >
> > > > Then ther are the MIDR_EL1 and MPIDR_EL1 registers. Linux allows
> > > > access to these registers. The problem with these is that their
> > > > values may depend on what CPU a process is executed on. So in general
> > > > I don't think userland code should look at these. However, some
> > > > userland code looks at these since certain optimizations might apply
> > > > only to specific CPU implementations. On amd64, where the CPUID
> > > > instruction is available to userland, looking at the CPU familiy and
> > > > model bits is accepted practice.
> > >
> > > Looks like killing processes for (at least) MIDR_EL1 access would lead
> > > to breakage in a bunch of ports.
> > >
> > > For example there is devel/abseil-cpp and the copies of it bundled
> > > into other ports. The detection is currently done through
> > > getauxval(3) but letting this SIGILL would be an inconvenient trap for
> > > anyone adding detection through elf_aux_info(3).
> >
> > Ok. So that does a microarchitectural optimization of the CRC
> > calculation. Looks like it only does that for hardware that Google
> > cares about in production though. So they avoid the issue with having
> > different cores integrated on a single SoC.
> >
> > So I've implemented MIDR_EL1, MPIDR_EL1 and REVIDR_EL1 in the same way
> > as Linux, passing through MIDR)EL1 (the core type) and faking the
> > other registers.
>
> Cool. The midr/revidr_el1 behavior matches what's documented at
> https://docs.kernel.org/arch/arm64/cpu-feature-registers.html
>
> > > https://codesearch.debian.net/search?q=%26.*HWCAP_CPUID&literal=0
> > >
> > > Granted, as long as we don't advertize HWCAP_CPUID your proposal is
> > > still more lenient/useful than our current policy of always have
> > > userland trap.
> > >
> > > > Thoughts?
>
> The emulation code looks good to me according to ARM DDI 0487K.a
> D22.3. I can't test this currently, ok jca@ FWIW.
As naddy@ found out, it actually isn't ok. The exception that helps
emulation of the MSR/MRS instructions is only available if FEAT_IDST
is implemented. That's an (mandatory) ARMv8.4 feature, so older CPU
cores don't support it. On those older CPUs the instructions generate
a standard "unknown instruction" trap.
We could implement support for emulating these instructions "the hard
way", by reading the instruction and decoding it. However, since
arm64 implements X-only, using copyin(9) to read the instruction
doesn't work and we haven't implemented copyinsn(9) on arm64. And
implementing copyinsn(9) isn't trivial on arm64.
Alternatively we can advertise HWCAP_CPUID only if FEAT_IDST is
implemented. That should be ok, since the older CPU cores shouldn't
have any features that aren't discoverable through AT_HWCAP and
AT_HWCAP2. Both Linux and FreeBSD do implement emulation "the hard
way". So there is a risk that code will skip checking the HWCAP_CPUID
before trying to read the ID registers. Such code will fail to run on
older CPUs. But such code would already be failing on what we have
now. So I propose we go with this solution.
One gotcha with this diff is that we now unconditionally read
ID_AA64MMFR2_EL1. This should be fine on real hardware (and return),
but old versions of QEMU (2.5.1 apparently) will trap. So without a
workaround, this diff will break OpenBSD on those old versions of
QEMU.
ok?
Index: arch/arm64/arm64/cpu.c
===================================================================
RCS file: /cvs/src/sys/arch/arm64/arm64/cpu.c,v
retrieving revision 1.129
diff -u -p -r1.129 cpu.c
--- arch/arm64/arm64/cpu.c 21 Jul 2024 18:57:31 -0000 1.129
+++ arch/arm64/arm64/cpu.c 22 Jul 2024 18:25:39 -0000
@@ -242,6 +242,9 @@ int cpu_node;
uint64_t cpu_id_aa64isar0;
uint64_t cpu_id_aa64isar1;
uint64_t cpu_id_aa64isar2;
+uint64_t cpu_id_aa64mmfr0;
+uint64_t cpu_id_aa64mmfr1;
+uint64_t cpu_id_aa64mmfr2;
uint64_t cpu_id_aa64pfr0;
uint64_t cpu_id_aa64pfr1;
@@ -487,6 +490,7 @@ cpu_identify(struct cpu_info *ci)
static uint64_t prev_id_aa64isar2;
static uint64_t prev_id_aa64mmfr0;
static uint64_t prev_id_aa64mmfr1;
+ static uint64_t prev_id_aa64mmfr2;
static uint64_t prev_id_aa64pfr0;
static uint64_t prev_id_aa64pfr1;
uint64_t midr, impl, part;
@@ -642,6 +646,7 @@ cpu_identify(struct cpu_info *ci)
READ_SPECIALREG(id_aa64isar2_el1) == prev_id_aa64isar2 &&
READ_SPECIALREG(id_aa64mmfr0_el1) == prev_id_aa64mmfr0 &&
READ_SPECIALREG(id_aa64mmfr1_el1) == prev_id_aa64mmfr1 &&
+ READ_SPECIALREG(id_aa64mmfr2_el1) == prev_id_aa64mmfr2 &&
READ_SPECIALREG(id_aa64pfr0_el1) == prev_id_aa64pfr0 &&
READ_SPECIALREG(id_aa64pfr1_el1) == prev_id_aa64pfr1)
return;
@@ -662,6 +667,18 @@ cpu_identify(struct cpu_info *ci)
printf("\n%s: mismatched ID_AA64ISAR2_EL1",
ci->ci_dev->dv_xname);
}
+ if (READ_SPECIALREG(id_aa64mmfr0_el1) != cpu_id_aa64mmfr0) {
+ printf("\n%s: mismatched ID_AA64MMFR0_EL1",
+ ci->ci_dev->dv_xname);
+ }
+ if (READ_SPECIALREG(id_aa64mmfr1_el1) != cpu_id_aa64mmfr1) {
+ printf("\n%s: mismatched ID_AA64MMFR1_EL1",
+ ci->ci_dev->dv_xname);
+ }
+ if (READ_SPECIALREG(id_aa64mmfr2_el1) != cpu_id_aa64mmfr2) {
+ printf("\n%s: mismatched ID_AA64MMFR2_EL1",
+ ci->ci_dev->dv_xname);
+ }
id = READ_SPECIALREG(id_aa64pfr0_el1);
/* Allow CSV2/CVS3 to be different. */
id &= ~ID_AA64PFR0_CSV2_MASK;
@@ -939,6 +956,16 @@ cpu_identify(struct cpu_info *ci)
}
/*
+ * ID_AA64MMFR2
+ */
+ id = READ_SPECIALREG(id_aa64mmfr2_el1);
+
+ if (ID_AA64MMFR2_IDS(id) >= ID_AA64MMFR2_IDS_IMPL) {
+ printf("%sIDS", sep);
+ sep = ",";
+ }
+
+ /*
* ID_AA64PFR0
*/
id = READ_SPECIALREG(id_aa64pfr0_el1);
@@ -989,6 +1016,7 @@ cpu_identify(struct cpu_info *ci)
prev_id_aa64isar2 = READ_SPECIALREG(id_aa64isar2_el1);
prev_id_aa64mmfr0 = READ_SPECIALREG(id_aa64mmfr0_el1);
prev_id_aa64mmfr1 = READ_SPECIALREG(id_aa64mmfr1_el1);
+ prev_id_aa64mmfr2 = READ_SPECIALREG(id_aa64mmfr2_el1);
prev_id_aa64pfr0 = READ_SPECIALREG(id_aa64pfr0_el1);
prev_id_aa64pfr1 = READ_SPECIALREG(id_aa64pfr1_el1);
@@ -1023,6 +1051,7 @@ cpu_identify(struct cpu_info *ci)
void
cpu_identify_cleanup(void)
{
+ uint64_t id_aa64mmfr2;
uint64_t value;
/* ID_AA64ISAR0_EL1 */
@@ -1040,6 +1069,15 @@ cpu_identify_cleanup(void)
value &= ~ID_AA64ISAR2_CLRBHB_MASK;
cpu_id_aa64isar2 = value;
+ /* ID_AA64MMFR0_EL1 */
+ cpu_id_aa64mmfr0 = 0;
+
+ /* ID_AA64MMFR1_EL1 */
+ cpu_id_aa64mmfr1 = 0;
+
+ /* ID_AA64MMFR2_EL1 */
+ cpu_id_aa64mmfr2 = 0;
+
/* ID_AA64PFR0_EL1 */
value = 0;
value |= cpu_id_aa64pfr0 & ID_AA64PFR0_FP_MASK;
@@ -1071,7 +1109,9 @@ cpu_identify_cleanup(void)
hwcap |= HWCAP_ATOMICS;
/* HWCAP_FPHP */
/* HWCAP_ASIMDHP */
- /* HWCAP_CPUID */
+ id_aa64mmfr2 = READ_SPECIALREG(id_aa64mmfr2_el1);
+ if (ID_AA64MMFR2_IDS(id_aa64mmfr2) >= ID_AA64MMFR2_IDS_IMPL)
+ hwcap |= HWCAP_CPUID;
if (ID_AA64ISAR0_RDM(cpu_id_aa64isar0) >= ID_AA64ISAR0_RDM_IMPL)
hwcap |= HWCAP_ASIMDRDM;
if (ID_AA64ISAR1_JSCVT(cpu_id_aa64isar1) >= ID_AA64ISAR1_JSCVT_IMPL)
@@ -1271,6 +1311,9 @@ cpu_attach(struct device *parent, struct
cpu_id_aa64isar0 = READ_SPECIALREG(id_aa64isar0_el1);
cpu_id_aa64isar1 = READ_SPECIALREG(id_aa64isar1_el1);
cpu_id_aa64isar2 = READ_SPECIALREG(id_aa64isar2_el1);
+ cpu_id_aa64mmfr0 = READ_SPECIALREG(id_aa64mmfr0_el1);
+ cpu_id_aa64mmfr1 = READ_SPECIALREG(id_aa64mmfr1_el1);
+ cpu_id_aa64mmfr2 = READ_SPECIALREG(id_aa64mmfr2_el1);
cpu_id_aa64pfr0 = READ_SPECIALREG(id_aa64pfr0_el1);
cpu_id_aa64pfr1 = READ_SPECIALREG(id_aa64pfr1_el1);
Index: arch/arm64/arm64/machdep.c
===================================================================
RCS file: /cvs/src/sys/arch/arm64/arm64/machdep.c,v
retrieving revision 1.91
diff -u -p -r1.91 machdep.c
--- arch/arm64/arm64/machdep.c 17 Jul 2024 15:21:59 -0000 1.91
+++ arch/arm64/arm64/machdep.c 22 Jul 2024 18:25:39 -0000
@@ -360,8 +360,11 @@ cpu_sysctl(int *name, u_int namelen, voi
case CPU_ID_AA64PFR1:
return sysctl_rdquad(oldp, oldlenp, newp, cpu_id_aa64pfr1);
case CPU_ID_AA64MMFR0:
+ return sysctl_rdquad(oldp, oldlenp, newp, cpu_id_aa64mmfr0);
case CPU_ID_AA64MMFR1:
+ return sysctl_rdquad(oldp, oldlenp, newp, cpu_id_aa64mmfr1);
case CPU_ID_AA64MMFR2:
+ return sysctl_rdquad(oldp, oldlenp, newp, cpu_id_aa64mmfr2);
case CPU_ID_AA64SMFR0:
case CPU_ID_AA64ZFR0:
return sysctl_rdquad(oldp, oldlenp, newp, 0);
Index: arch/arm64/arm64/trap.c
===================================================================
RCS file: /cvs/src/sys/arch/arm64/arm64/trap.c,v
retrieving revision 1.48
diff -u -p -r1.48 trap.c
--- arch/arm64/arm64/trap.c 21 Feb 2024 15:53:07 -0000 1.48
+++ arch/arm64/arm64/trap.c 22 Jul 2024 18:25:39 -0000
@@ -187,6 +187,99 @@ kdata_abort(struct trapframe *frame, uin
}
}
+static int
+emulate_msr(struct trapframe *frame, uint64_t esr)
+{
+ u_int rt = ISS_MSR_Rt(esr);
+ uint64_t val;
+
+ /* Only emulate reads. */
+ if ((esr & ISS_MSR_DIR) == 0)
+ return 0;
+
+ /* Only emulate non-debug System register access. */
+ if (ISS_MSR_OP0(esr) != 3 || ISS_MSR_OP1(esr) != 0 ||
+ ISS_MSR_CRn(esr) != 0)
+ return 0;
+
+ switch (ISS_MSR_CRm(esr)) {
+ case 0:
+ switch (ISS_MSR_OP2(esr)) {
+ case 0: /* MIDR_EL1 */
+ val = READ_SPECIALREG(midr_el1);
+ break;
+ case 5: /* MPIDR_EL1 */
+ /*
+ * Don't reveal the topology to userland. But
+ * return a valid value; Bit 31 is RES1.
+ */
+ val = 0x80000000;
+ break;
+ case 6: /* REVIDR_EL1 */
+ val = 0;
+ break;
+ default:
+ return 0;
+ }
+ break;
+ case 4:
+ switch (ISS_MSR_OP2(esr)) {
+ case 0: /* ID_AA64PFR0_EL1 */
+ val = cpu_id_aa64pfr0;
+ break;
+ case 1: /* ID_AA64PFR1_EL1 */
+ val = cpu_id_aa64pfr1;
+ break;
+ case 2: /* ID_AA64PFR2_EL1 */
+ case 4: /* ID_AA64ZFR0_EL1 */
+ case 5: /* ID_AA64SMFR0_EL1 */
+ val = 0;
+ break;
+ default:
+ return 0;
+ }
+ break;
+ case 6:
+ switch (ISS_MSR_OP2(esr)) {
+ case 0: /* ID_AA64ISAR0_EL1 */
+ val = cpu_id_aa64isar0;
+ break;
+ case 1: /* ID_AA64ISAR1_EL1 */
+ val = cpu_id_aa64isar1;
+ break;
+ case 2: /* ID_AA64ISAR2_EL2 */
+ val = cpu_id_aa64isar2;
+ break;
+ default:
+ return 0;
+ }
+ break;
+ case 7:
+ switch (ISS_MSR_OP2(esr)) {
+ case 0: /* ID_AA64MMFR0_EL1 */
+ case 1: /* ID_AA64MMFR1_EL1 */
+ case 2: /* ID_AA64MMFR2_EL1 */
+ case 3: /* ID_AA64MMFR3_EL1 */
+ case 4: /* ID_AA64MMFR4_EL1 */
+ val = 0;
+ break;
+ default:
+ return 0;
+ }
+ break;
+ default:
+ return 0;
+ }
+
+ if (rt < 30)
+ frame->tf_x[rt] = val;
+ else if (rt == 30)
+ frame->tf_lr = val;
+ frame->tf_elr += 4;
+
+ return 1;
+}
+
void
do_el1h_sync(struct trapframe *frame)
{
@@ -288,6 +381,10 @@ do_el0_sync(struct trapframe *frame)
sv.sival_ptr = (void *)frame->tf_elr;
trapsignal(p, SIGILL, esr, ILL_BTCFI, sv);
break;
+ case EXCP_MSR:
+ if (emulate_msr(frame, esr))
+ break;
+ /* FALLTHROUGH */
case EXCP_FPAC:
curcpu()->ci_flush_bp();
sv.sival_ptr = (void *)frame->tf_elr;
Index: arch/arm64/include/armreg.h
===================================================================
RCS file: /cvs/src/sys/arch/arm64/include/armreg.h,v
retrieving revision 1.35
diff -u -p -r1.35 armreg.h
--- arch/arm64/include/armreg.h 23 Jun 2024 10:17:16 -0000 1.35
+++ arch/arm64/include/armreg.h 22 Jul 2024 18:25:40 -0000
@@ -171,6 +171,26 @@
#define ISS_DATA_DFSC_ECC_L3 (0x1f << 0)
#define ISS_DATA_DFSC_ALIGN (0x21 << 0)
#define ISS_DATA_DFSC_TLB_CONFLICT (0x30 << 0)
+#define ISS_MSR_DIR_SHIFT 0
+#define ISS_MSR_DIR (0x01 << ISS_MSR_DIR_SHIFT)
+#define ISS_MSR_Rt_SHIFT 5
+#define ISS_MSR_Rt_MASK (0x1f << ISS_MSR_Rt_SHIFT)
+#define ISS_MSR_Rt(x) (((x) & ISS_MSR_Rt_MASK) >> ISS_MSR_Rt_SHIFT)
+#define ISS_MSR_CRm_SHIFT 1
+#define ISS_MSR_CRm_MASK (0xf << ISS_MSR_CRm_SHIFT)
+#define ISS_MSR_CRm(x) (((x) & ISS_MSR_CRm_MASK) >> ISS_MSR_CRm_SHIFT)
+#define ISS_MSR_CRn_SHIFT 10
+#define ISS_MSR_CRn_MASK (0xf << ISS_MSR_CRn_SHIFT)
+#define ISS_MSR_CRn(x) (((x) & ISS_MSR_CRn_MASK) >> ISS_MSR_CRn_SHIFT)
+#define ISS_MSR_OP1_SHIFT 14
+#define ISS_MSR_OP1_MASK (0x7 << ISS_MSR_OP1_SHIFT)
+#define ISS_MSR_OP1(x) (((x) & ISS_MSR_OP1_MASK) >> ISS_MSR_OP1_SHIFT)
+#define ISS_MSR_OP2_SHIFT 17
+#define ISS_MSR_OP2_MASK (0x7 << ISS_MSR_OP2_SHIFT)
+#define ISS_MSR_OP2(x) (((x) & ISS_MSR_OP2_MASK) >> ISS_MSR_OP2_SHIFT)
+#define ISS_MSR_OP0_SHIFT 20
+#define ISS_MSR_OP0_MASK (0x3 << ISS_MSR_OP0_SHIFT)
+#define ISS_MSR_OP0(x) (((x) & ISS_MSR_OP0_MASK) >> ISS_MSR_OP0_SHIFT)
#define ESR_ELx_IL (0x01 << 25)
#define ESR_ELx_EC_SHIFT 26
#define ESR_ELx_EC_MASK (0x3f << 26)
@@ -537,6 +557,10 @@
#define ID_AA64MMFR2_CCIDX_MASK (0xfULL << ID_AA64MMFR2_CCIDX_SHIFT)
#define ID_AA64MMFR2_CCIDX(x) ((x) & ID_AA64MMFR2_CCIDX_MASK)
#define ID_AA64MMFR2_CCIDX_IMPL (0x1ULL << ID_AA64MMFR2_CCIDX_SHIFT)
+#define ID_AA64MMFR2_IDS_SHIFT 36
+#define ID_AA64MMFR2_IDS_MASK (0xfULL << ID_AA64MMFR2_IDS_SHIFT)
+#define ID_AA64MMFR2_IDS(x) ((x) & ID_AA64MMFR2_IDS_MASK)
+#define ID_AA64MMFR2_IDS_IMPL (0x1ULL << ID_AA64MMFR2_IDS_SHIFT)
/* ID_AA64PFR0_EL1 */
#define ID_AA64PFR0_MASK 0xff0fffffffffffffULL
Index: arch/arm64/include/cpu.h
===================================================================
RCS file: /cvs/src/sys/arch/arm64/include/cpu.h,v
retrieving revision 1.49
diff -u -p -r1.49 cpu.h
--- arch/arm64/include/cpu.h 17 Jul 2024 15:21:59 -0000 1.49
+++ arch/arm64/include/cpu.h 22 Jul 2024 18:25:40 -0000
@@ -64,6 +64,9 @@
extern uint64_t cpu_id_aa64isar0;
extern uint64_t cpu_id_aa64isar1;
extern uint64_t cpu_id_aa64isar2;
+extern uint64_t cpu_id_aa64mmfr0;
+extern uint64_t cpu_id_aa64mmfr1;
+extern uint64_t cpu_id_aa64mmfr2;
extern uint64_t cpu_id_aa64pfr0;
extern uint64_t cpu_id_aa64pfr1;
Emulate CPU ID register access on arm64