Allow vmm guests to save/restore pkru via xsave region
We're currently letting a guest see the PKU bit via cpuid(0x7, 0) but
then not exposing the XSAVE area bit for PKRU via cpuid(0xd, 0).
This is fine for OpenBSD guests as we don't use XSAVE/XRSTOR to
save/load PKRU state on context switches given how we use PKU today for
x-only on Intel. (We explicitly use RDPKRU/WRPKRU.)
Newer Linux guests see PKU support and then freak out when the support
for XSAVE doesn't exist. This puts them into a very "fun" state where
they don't panic but instead thrash about like a lunatic. It's sad,
really. See [1] for details.
This diff exposes the cpuid bit for the PKRU xsave area if the host has
enabled PKU mode. It also adds handling to let the guest set the xcr0
bit to let XSAVE/XRSTOR work for the PKRU state.
It does not change host PKU or XSAVE behavior.
The cpuid emulation change is a bit hacky in how it reports the xsave
area's maximum and current sizes, but since the PKRU state comes after
all the AVX512 state it's basically just the size of `struct savefpu`
plus 64 bits (32 for the PKRU register, 32 for padding/alignment).
ok? feedback?
[1] https://marc.info/?l=openbsd-misc&m=175505530816494&w=2
diff refs/heads/ugh refs/heads/vmm-xsave-pku
commit - 86e813bbfc617cf34c990962d64537d38e459d3d
commit + 2ed79f99ed9cb7ebd79a067c80332dc033fb6ffb
blob - bd1909f33280ca49fa1c546c313f54ac3a52d9b1
blob + 5021bdd6aaeb50d33dca12986277628ec75f6b14
--- sys/arch/amd64/amd64/vmm_machdep.c
+++ sys/arch/amd64/amd64/vmm_machdep.c
@@ -5927,7 +5927,7 @@ svm_handle_xsetbv(struct vcpu *vcpu)
int
vmm_handle_xsetbv(struct vcpu *vcpu, uint64_t *rax)
{
- uint64_t *rdx, *rcx, val;
+ uint64_t *rdx, *rcx, val, mask = xsave_mask;
rcx = &vcpu->vc_gueststate.vg_rcx;
rdx = &vcpu->vc_gueststate.vg_rdx;
@@ -5943,8 +5943,12 @@ vmm_handle_xsetbv(struct vcpu *vcpu, uint64_t *rax)
return (vmm_inject_gp(vcpu));
}
+ /* If we're exposing PKRU features, allow guests to set PKRU in xcr0. */
+ if (vmm_softc->sc_md.pkru_enabled)
+ mask |= XFEATURE_PKRU;
+
val = *rax + (*rdx << 32);
- if (val & ~xsave_mask) {
+ if (val & ~mask) {
DPRINTF("%s: guest specified xcr0 outside xsave_mask %lld\n",
__func__, val);
return (vmm_inject_gp(vcpu));
@@ -6187,6 +6191,22 @@ vmm_handle_cpuid_0xd(struct vcpu *vcpu, uint32_t suble
}
eax = xsave_mask & XFEATURE_XCR0_MASK;
edx = (xsave_mask & XFEATURE_XCR0_MASK) >> 32;
+
+ /*
+ * We don't currently use XSAVE to store/restore PKRU,
+ * but some guests may expect to do so. If PKE is
+ * supported, the PKRU feature bit should be 1.
+ *
+ * This also means adjusting the reported sizes of the
+ * XSAVE area as it requires an additional 32 bits, but
+ * also needs to be 64-bit aligned.
+ */
+ if (vmm_softc->sc_md.pkru_enabled) {
+ eax |= XFEATURE_PKRU;
+ ecx = sizeof(struct savefpu) + 64;
+ if (xcr0 & XFEATURE_PKRU)
+ ebx = ecx;
+ }
} else if (subleaf == 1) {
/* mask out XSAVEC, XSAVES, and XFD support */
eax &= XSAVE_XSAVEOPT | XSAVE_XGETBV1;