From: Mark Kettenis Subject: Re: Emulate CPU ID register access on arm64 To: Jeremie Courreges-Anglas Cc: patrick@openbsd.org, tech@openbsd.org, brad@comstyle.com, naddy@openbsd.org Date: Mon, 22 Jul 2024 20:30:58 +0200 > Date: Sun, 21 Jul 2024 21:41:47 +0200 > From: Jeremie Courreges-Anglas > > On Wed, Jul 17, 2024 at 10:40:57PM +0200, Mark Kettenis wrote: > > > Date: Sun, 14 Jul 2024 12:52:36 +0200 > > > From: Jeremie Courreges-Anglas > > > > The code rearrange bit is in, so that leaves the emulation bits. > > > > > On Sun, Jul 14, 2024 at 11:25:27AM +0200, Mark Kettenis wrote: > > > > As mentioned in the HWCAP discussion, new arm64 bits are typically no > > > > longer assigned and instead the HWCAP_CPUID bit is used to signal that > > > > the kernel emulates access to the CPU ID registers. The diff below > > > > implements this. > > > > > > > > The architecture is clearly designed to make emulattion possible and > > > > easy. There is a special trap for MSR access and it provides all the > > > > detailt needed to emulate access without the need to read the > > > > instruction. The trapping from EL0 is probably done to let the OS > > > > sanitize the values such that only features relevant to userland and > > > > common to all CPU cores in the system are advertised. > > > > > > > > There are some open questions though. The diff is rather strict in > > > > what access it emulates. For now this is only the "known" ID_AA64_xxx > > > > registers, that is the ID_AA64_xxx registers that are currently > > > > defined by the architectures. I think the architecture actually says > > > > that the currently undefined ID_AA64_xxx registers should return zero, > > > > and Linux allows access to them all. But I would like to know when > > > > userland processes actually start accessing those, so for now we'll > > > > SIGILL. > > > > > > ack > > > > > > > I didn't implement emulation for the "32-bit" ID_xxx registers. We > > > > don't support excuting 32-bit processes so these registers should be > > > > irrelevant. But they can be accessed using arm64 instructions, so for > > > > now we'll continue to SIGILL these as well. > > > > > > No idea how often those registers might be used. > > > > > > > Then ther are the MIDR_EL1 and MPIDR_EL1 registers. Linux allows > > > > access to these registers. The problem with these is that their > > > > values may depend on what CPU a process is executed on. So in general > > > > I don't think userland code should look at these. However, some > > > > userland code looks at these since certain optimizations might apply > > > > only to specific CPU implementations. On amd64, where the CPUID > > > > instruction is available to userland, looking at the CPU familiy and > > > > model bits is accepted practice. > > > > > > Looks like killing processes for (at least) MIDR_EL1 access would lead > > > to breakage in a bunch of ports. > > > > > > For example there is devel/abseil-cpp and the copies of it bundled > > > into other ports. The detection is currently done through > > > getauxval(3) but letting this SIGILL would be an inconvenient trap for > > > anyone adding detection through elf_aux_info(3). > > > > Ok. So that does a microarchitectural optimization of the CRC > > calculation. Looks like it only does that for hardware that Google > > cares about in production though. So they avoid the issue with having > > different cores integrated on a single SoC. > > > > So I've implemented MIDR_EL1, MPIDR_EL1 and REVIDR_EL1 in the same way > > as Linux, passing through MIDR)EL1 (the core type) and faking the > > other registers. > > Cool. The midr/revidr_el1 behavior matches what's documented at > https://docs.kernel.org/arch/arm64/cpu-feature-registers.html > > > > https://codesearch.debian.net/search?q=%26.*HWCAP_CPUID&literal=0 > > > > > > Granted, as long as we don't advertize HWCAP_CPUID your proposal is > > > still more lenient/useful than our current policy of always have > > > userland trap. > > > > > > > Thoughts? > > The emulation code looks good to me according to ARM DDI 0487K.a > D22.3. I can't test this currently, ok jca@ FWIW. As naddy@ found out, it actually isn't ok. The exception that helps emulation of the MSR/MRS instructions is only available if FEAT_IDST is implemented. That's an (mandatory) ARMv8.4 feature, so older CPU cores don't support it. On those older CPUs the instructions generate a standard "unknown instruction" trap. We could implement support for emulating these instructions "the hard way", by reading the instruction and decoding it. However, since arm64 implements X-only, using copyin(9) to read the instruction doesn't work and we haven't implemented copyinsn(9) on arm64. And implementing copyinsn(9) isn't trivial on arm64. Alternatively we can advertise HWCAP_CPUID only if FEAT_IDST is implemented. That should be ok, since the older CPU cores shouldn't have any features that aren't discoverable through AT_HWCAP and AT_HWCAP2. Both Linux and FreeBSD do implement emulation "the hard way". So there is a risk that code will skip checking the HWCAP_CPUID before trying to read the ID registers. Such code will fail to run on older CPUs. But such code would already be failing on what we have now. So I propose we go with this solution. One gotcha with this diff is that we now unconditionally read ID_AA64MMFR2_EL1. This should be fine on real hardware (and return), but old versions of QEMU (2.5.1 apparently) will trap. So without a workaround, this diff will break OpenBSD on those old versions of QEMU. ok? Index: arch/arm64/arm64/cpu.c =================================================================== RCS file: /cvs/src/sys/arch/arm64/arm64/cpu.c,v retrieving revision 1.129 diff -u -p -r1.129 cpu.c --- arch/arm64/arm64/cpu.c 21 Jul 2024 18:57:31 -0000 1.129 +++ arch/arm64/arm64/cpu.c 22 Jul 2024 18:25:39 -0000 @@ -242,6 +242,9 @@ int cpu_node; uint64_t cpu_id_aa64isar0; uint64_t cpu_id_aa64isar1; uint64_t cpu_id_aa64isar2; +uint64_t cpu_id_aa64mmfr0; +uint64_t cpu_id_aa64mmfr1; +uint64_t cpu_id_aa64mmfr2; uint64_t cpu_id_aa64pfr0; uint64_t cpu_id_aa64pfr1; @@ -487,6 +490,7 @@ cpu_identify(struct cpu_info *ci) static uint64_t prev_id_aa64isar2; static uint64_t prev_id_aa64mmfr0; static uint64_t prev_id_aa64mmfr1; + static uint64_t prev_id_aa64mmfr2; static uint64_t prev_id_aa64pfr0; static uint64_t prev_id_aa64pfr1; uint64_t midr, impl, part; @@ -642,6 +646,7 @@ cpu_identify(struct cpu_info *ci) READ_SPECIALREG(id_aa64isar2_el1) == prev_id_aa64isar2 && READ_SPECIALREG(id_aa64mmfr0_el1) == prev_id_aa64mmfr0 && READ_SPECIALREG(id_aa64mmfr1_el1) == prev_id_aa64mmfr1 && + READ_SPECIALREG(id_aa64mmfr2_el1) == prev_id_aa64mmfr2 && READ_SPECIALREG(id_aa64pfr0_el1) == prev_id_aa64pfr0 && READ_SPECIALREG(id_aa64pfr1_el1) == prev_id_aa64pfr1) return; @@ -662,6 +667,18 @@ cpu_identify(struct cpu_info *ci) printf("\n%s: mismatched ID_AA64ISAR2_EL1", ci->ci_dev->dv_xname); } + if (READ_SPECIALREG(id_aa64mmfr0_el1) != cpu_id_aa64mmfr0) { + printf("\n%s: mismatched ID_AA64MMFR0_EL1", + ci->ci_dev->dv_xname); + } + if (READ_SPECIALREG(id_aa64mmfr1_el1) != cpu_id_aa64mmfr1) { + printf("\n%s: mismatched ID_AA64MMFR1_EL1", + ci->ci_dev->dv_xname); + } + if (READ_SPECIALREG(id_aa64mmfr2_el1) != cpu_id_aa64mmfr2) { + printf("\n%s: mismatched ID_AA64MMFR2_EL1", + ci->ci_dev->dv_xname); + } id = READ_SPECIALREG(id_aa64pfr0_el1); /* Allow CSV2/CVS3 to be different. */ id &= ~ID_AA64PFR0_CSV2_MASK; @@ -939,6 +956,16 @@ cpu_identify(struct cpu_info *ci) } /* + * ID_AA64MMFR2 + */ + id = READ_SPECIALREG(id_aa64mmfr2_el1); + + if (ID_AA64MMFR2_IDS(id) >= ID_AA64MMFR2_IDS_IMPL) { + printf("%sIDS", sep); + sep = ","; + } + + /* * ID_AA64PFR0 */ id = READ_SPECIALREG(id_aa64pfr0_el1); @@ -989,6 +1016,7 @@ cpu_identify(struct cpu_info *ci) prev_id_aa64isar2 = READ_SPECIALREG(id_aa64isar2_el1); prev_id_aa64mmfr0 = READ_SPECIALREG(id_aa64mmfr0_el1); prev_id_aa64mmfr1 = READ_SPECIALREG(id_aa64mmfr1_el1); + prev_id_aa64mmfr2 = READ_SPECIALREG(id_aa64mmfr2_el1); prev_id_aa64pfr0 = READ_SPECIALREG(id_aa64pfr0_el1); prev_id_aa64pfr1 = READ_SPECIALREG(id_aa64pfr1_el1); @@ -1023,6 +1051,7 @@ cpu_identify(struct cpu_info *ci) void cpu_identify_cleanup(void) { + uint64_t id_aa64mmfr2; uint64_t value; /* ID_AA64ISAR0_EL1 */ @@ -1040,6 +1069,15 @@ cpu_identify_cleanup(void) value &= ~ID_AA64ISAR2_CLRBHB_MASK; cpu_id_aa64isar2 = value; + /* ID_AA64MMFR0_EL1 */ + cpu_id_aa64mmfr0 = 0; + + /* ID_AA64MMFR1_EL1 */ + cpu_id_aa64mmfr1 = 0; + + /* ID_AA64MMFR2_EL1 */ + cpu_id_aa64mmfr2 = 0; + /* ID_AA64PFR0_EL1 */ value = 0; value |= cpu_id_aa64pfr0 & ID_AA64PFR0_FP_MASK; @@ -1071,7 +1109,9 @@ cpu_identify_cleanup(void) hwcap |= HWCAP_ATOMICS; /* HWCAP_FPHP */ /* HWCAP_ASIMDHP */ - /* HWCAP_CPUID */ + id_aa64mmfr2 = READ_SPECIALREG(id_aa64mmfr2_el1); + if (ID_AA64MMFR2_IDS(id_aa64mmfr2) >= ID_AA64MMFR2_IDS_IMPL) + hwcap |= HWCAP_CPUID; if (ID_AA64ISAR0_RDM(cpu_id_aa64isar0) >= ID_AA64ISAR0_RDM_IMPL) hwcap |= HWCAP_ASIMDRDM; if (ID_AA64ISAR1_JSCVT(cpu_id_aa64isar1) >= ID_AA64ISAR1_JSCVT_IMPL) @@ -1271,6 +1311,9 @@ cpu_attach(struct device *parent, struct cpu_id_aa64isar0 = READ_SPECIALREG(id_aa64isar0_el1); cpu_id_aa64isar1 = READ_SPECIALREG(id_aa64isar1_el1); cpu_id_aa64isar2 = READ_SPECIALREG(id_aa64isar2_el1); + cpu_id_aa64mmfr0 = READ_SPECIALREG(id_aa64mmfr0_el1); + cpu_id_aa64mmfr1 = READ_SPECIALREG(id_aa64mmfr1_el1); + cpu_id_aa64mmfr2 = READ_SPECIALREG(id_aa64mmfr2_el1); cpu_id_aa64pfr0 = READ_SPECIALREG(id_aa64pfr0_el1); cpu_id_aa64pfr1 = READ_SPECIALREG(id_aa64pfr1_el1); Index: arch/arm64/arm64/machdep.c =================================================================== RCS file: /cvs/src/sys/arch/arm64/arm64/machdep.c,v retrieving revision 1.91 diff -u -p -r1.91 machdep.c --- arch/arm64/arm64/machdep.c 17 Jul 2024 15:21:59 -0000 1.91 +++ arch/arm64/arm64/machdep.c 22 Jul 2024 18:25:39 -0000 @@ -360,8 +360,11 @@ cpu_sysctl(int *name, u_int namelen, voi case CPU_ID_AA64PFR1: return sysctl_rdquad(oldp, oldlenp, newp, cpu_id_aa64pfr1); case CPU_ID_AA64MMFR0: + return sysctl_rdquad(oldp, oldlenp, newp, cpu_id_aa64mmfr0); case CPU_ID_AA64MMFR1: + return sysctl_rdquad(oldp, oldlenp, newp, cpu_id_aa64mmfr1); case CPU_ID_AA64MMFR2: + return sysctl_rdquad(oldp, oldlenp, newp, cpu_id_aa64mmfr2); case CPU_ID_AA64SMFR0: case CPU_ID_AA64ZFR0: return sysctl_rdquad(oldp, oldlenp, newp, 0); Index: arch/arm64/arm64/trap.c =================================================================== RCS file: /cvs/src/sys/arch/arm64/arm64/trap.c,v retrieving revision 1.48 diff -u -p -r1.48 trap.c --- arch/arm64/arm64/trap.c 21 Feb 2024 15:53:07 -0000 1.48 +++ arch/arm64/arm64/trap.c 22 Jul 2024 18:25:39 -0000 @@ -187,6 +187,99 @@ kdata_abort(struct trapframe *frame, uin } } +static int +emulate_msr(struct trapframe *frame, uint64_t esr) +{ + u_int rt = ISS_MSR_Rt(esr); + uint64_t val; + + /* Only emulate reads. */ + if ((esr & ISS_MSR_DIR) == 0) + return 0; + + /* Only emulate non-debug System register access. */ + if (ISS_MSR_OP0(esr) != 3 || ISS_MSR_OP1(esr) != 0 || + ISS_MSR_CRn(esr) != 0) + return 0; + + switch (ISS_MSR_CRm(esr)) { + case 0: + switch (ISS_MSR_OP2(esr)) { + case 0: /* MIDR_EL1 */ + val = READ_SPECIALREG(midr_el1); + break; + case 5: /* MPIDR_EL1 */ + /* + * Don't reveal the topology to userland. But + * return a valid value; Bit 31 is RES1. + */ + val = 0x80000000; + break; + case 6: /* REVIDR_EL1 */ + val = 0; + break; + default: + return 0; + } + break; + case 4: + switch (ISS_MSR_OP2(esr)) { + case 0: /* ID_AA64PFR0_EL1 */ + val = cpu_id_aa64pfr0; + break; + case 1: /* ID_AA64PFR1_EL1 */ + val = cpu_id_aa64pfr1; + break; + case 2: /* ID_AA64PFR2_EL1 */ + case 4: /* ID_AA64ZFR0_EL1 */ + case 5: /* ID_AA64SMFR0_EL1 */ + val = 0; + break; + default: + return 0; + } + break; + case 6: + switch (ISS_MSR_OP2(esr)) { + case 0: /* ID_AA64ISAR0_EL1 */ + val = cpu_id_aa64isar0; + break; + case 1: /* ID_AA64ISAR1_EL1 */ + val = cpu_id_aa64isar1; + break; + case 2: /* ID_AA64ISAR2_EL2 */ + val = cpu_id_aa64isar2; + break; + default: + return 0; + } + break; + case 7: + switch (ISS_MSR_OP2(esr)) { + case 0: /* ID_AA64MMFR0_EL1 */ + case 1: /* ID_AA64MMFR1_EL1 */ + case 2: /* ID_AA64MMFR2_EL1 */ + case 3: /* ID_AA64MMFR3_EL1 */ + case 4: /* ID_AA64MMFR4_EL1 */ + val = 0; + break; + default: + return 0; + } + break; + default: + return 0; + } + + if (rt < 30) + frame->tf_x[rt] = val; + else if (rt == 30) + frame->tf_lr = val; + frame->tf_elr += 4; + + return 1; +} + void do_el1h_sync(struct trapframe *frame) { @@ -288,6 +381,10 @@ do_el0_sync(struct trapframe *frame) sv.sival_ptr = (void *)frame->tf_elr; trapsignal(p, SIGILL, esr, ILL_BTCFI, sv); break; + case EXCP_MSR: + if (emulate_msr(frame, esr)) + break; + /* FALLTHROUGH */ case EXCP_FPAC: curcpu()->ci_flush_bp(); sv.sival_ptr = (void *)frame->tf_elr; Index: arch/arm64/include/armreg.h =================================================================== RCS file: /cvs/src/sys/arch/arm64/include/armreg.h,v retrieving revision 1.35 diff -u -p -r1.35 armreg.h --- arch/arm64/include/armreg.h 23 Jun 2024 10:17:16 -0000 1.35 +++ arch/arm64/include/armreg.h 22 Jul 2024 18:25:40 -0000 @@ -171,6 +171,26 @@ #define ISS_DATA_DFSC_ECC_L3 (0x1f << 0) #define ISS_DATA_DFSC_ALIGN (0x21 << 0) #define ISS_DATA_DFSC_TLB_CONFLICT (0x30 << 0) +#define ISS_MSR_DIR_SHIFT 0 +#define ISS_MSR_DIR (0x01 << ISS_MSR_DIR_SHIFT) +#define ISS_MSR_Rt_SHIFT 5 +#define ISS_MSR_Rt_MASK (0x1f << ISS_MSR_Rt_SHIFT) +#define ISS_MSR_Rt(x) (((x) & ISS_MSR_Rt_MASK) >> ISS_MSR_Rt_SHIFT) +#define ISS_MSR_CRm_SHIFT 1 +#define ISS_MSR_CRm_MASK (0xf << ISS_MSR_CRm_SHIFT) +#define ISS_MSR_CRm(x) (((x) & ISS_MSR_CRm_MASK) >> ISS_MSR_CRm_SHIFT) +#define ISS_MSR_CRn_SHIFT 10 +#define ISS_MSR_CRn_MASK (0xf << ISS_MSR_CRn_SHIFT) +#define ISS_MSR_CRn(x) (((x) & ISS_MSR_CRn_MASK) >> ISS_MSR_CRn_SHIFT) +#define ISS_MSR_OP1_SHIFT 14 +#define ISS_MSR_OP1_MASK (0x7 << ISS_MSR_OP1_SHIFT) +#define ISS_MSR_OP1(x) (((x) & ISS_MSR_OP1_MASK) >> ISS_MSR_OP1_SHIFT) +#define ISS_MSR_OP2_SHIFT 17 +#define ISS_MSR_OP2_MASK (0x7 << ISS_MSR_OP2_SHIFT) +#define ISS_MSR_OP2(x) (((x) & ISS_MSR_OP2_MASK) >> ISS_MSR_OP2_SHIFT) +#define ISS_MSR_OP0_SHIFT 20 +#define ISS_MSR_OP0_MASK (0x3 << ISS_MSR_OP0_SHIFT) +#define ISS_MSR_OP0(x) (((x) & ISS_MSR_OP0_MASK) >> ISS_MSR_OP0_SHIFT) #define ESR_ELx_IL (0x01 << 25) #define ESR_ELx_EC_SHIFT 26 #define ESR_ELx_EC_MASK (0x3f << 26) @@ -537,6 +557,10 @@ #define ID_AA64MMFR2_CCIDX_MASK (0xfULL << ID_AA64MMFR2_CCIDX_SHIFT) #define ID_AA64MMFR2_CCIDX(x) ((x) & ID_AA64MMFR2_CCIDX_MASK) #define ID_AA64MMFR2_CCIDX_IMPL (0x1ULL << ID_AA64MMFR2_CCIDX_SHIFT) +#define ID_AA64MMFR2_IDS_SHIFT 36 +#define ID_AA64MMFR2_IDS_MASK (0xfULL << ID_AA64MMFR2_IDS_SHIFT) +#define ID_AA64MMFR2_IDS(x) ((x) & ID_AA64MMFR2_IDS_MASK) +#define ID_AA64MMFR2_IDS_IMPL (0x1ULL << ID_AA64MMFR2_IDS_SHIFT) /* ID_AA64PFR0_EL1 */ #define ID_AA64PFR0_MASK 0xff0fffffffffffffULL Index: arch/arm64/include/cpu.h =================================================================== RCS file: /cvs/src/sys/arch/arm64/include/cpu.h,v retrieving revision 1.49 diff -u -p -r1.49 cpu.h --- arch/arm64/include/cpu.h 17 Jul 2024 15:21:59 -0000 1.49 +++ arch/arm64/include/cpu.h 22 Jul 2024 18:25:40 -0000 @@ -64,6 +64,9 @@ extern uint64_t cpu_id_aa64isar0; extern uint64_t cpu_id_aa64isar1; extern uint64_t cpu_id_aa64isar2; +extern uint64_t cpu_id_aa64mmfr0; +extern uint64_t cpu_id_aa64mmfr1; +extern uint64_t cpu_id_aa64mmfr2; extern uint64_t cpu_id_aa64pfr0; extern uint64_t cpu_id_aa64pfr1;