From: Greg Schaefer <gsgs7878@proton.me>
Subject: Silent crash when boot-CPU not first in X2APIC
To: "tech@openbsd.org" <tech@openbsd.org>
Date: Mon, 29 Dec 2025 04:56:04 +0000

I purchased an ASRock Industrial NUC-125H R2 (Intel Meteor Lake Ultra 5
125H, 4 P-Core 6 E-Core 2 LP-Core) to upgrade an edge server. This is
the first time OpenBSD has failed to boot the installer for me on Intel
hardware. The last message printed is an unrelated video/DRM line
(misidentified as the cause in other reports).

Confirmed on 7.7 and 7.8 (and I am confident any version is affected):
OpenBSD crashes silently if the boot processor is not the first X2APIC
entry. A recent Linux boots fine and logs the underlying issue.
Restricting the NUC-125H to 2 P-Cores via the BIOS shifts the boot
processor to X2APIC[0], and OpenBSD boots. My guess is that the BIOS
started life on 2 P-Core variants and something went wrong. That said,
in our world of hybrid processors, I expect this to become more common.
At the very least, OpenBSD should log a critical message before
crashing.
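
As an illustration of the kind of check meant here, the following is a
minimal user-space sketch, not kernel code. The struct x2apic_entry type
and both function names are hypothetical stand-ins for the real MADT
structures, and printf stands in for whatever logging the kernel would
use. It only shows the idea of noticing that the boot processor is not
entry 0 and saying so before anything else happens.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for one MADT x2APIC entry. */
struct x2apic_entry {
	uint32_t apic_id;
	uint32_t acpi_proc_uid;
};

/*
 * Return the index of the entry whose APIC ID matches the boot
 * processor, or -1 if no entry matches.
 */
int
find_bp_index(const struct x2apic_entry *e, size_t n, uint32_t bp_apic_id)
{
	assert(e != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (e[i].apic_id == bp_apic_id)
			return (int)i;
	}
	return -1;
}

/*
 * Print a warning when the boot processor is missing or is not the
 * first entry. Returns 1 if a warning was printed, 0 otherwise.
 */
int
warn_if_bp_not_first(const struct x2apic_entry *e, size_t n,
    uint32_t bp_apic_id)
{
	int idx = find_bp_index(e, n, bp_apic_id);

	if (idx == 0)
		return 0;
	if (idx < 0)
		printf("madt: boot processor apic_id %u not listed\n",
		    (unsigned)bp_apic_id);
	else
		printf("madt: boot processor apic_id %u is x2APIC "
		    "entry %d, not entry 0\n", (unsigned)bp_apic_id, idx);
	return 1;
}
```

With a table whose APIC IDs are 16, 0, 8 and a boot processor ID of 0,
find_bp_index returns 1 and warn_if_bp_not_first prints the second
message.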

Modifying acpimadt_attach in acpi/acpimadt.c to "look forward" for the
boot processor when ACPI_MADT_X2APIC[0] != BP fixes the issue. Given
that most of acpimadt.c is almost 20 years old, I completely understand
any reluctance to modify it. (It might be possible to reorder the
cpu_attach table after parsing, but that felt more invasive and I did
not try it.) My diff is below for reference. (I factored the duplicated
code out into acpimadt_attach_prep to avoid a third copy.)
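
The "look forward" ordering can be sketched in isolation as follows.
This is a simplified user-space model under stated assumptions, not the
kernel change itself: struct x2apic_entry and attach_order are
hypothetical names, and writing an index into the order array stands in
for the config_found call. The point is only the ordering rule: if the
first entry is an AP, scan ahead and attach the BP first, then skip the
BP when the main loop reaches it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one MADT x2APIC entry. */
struct x2apic_entry {
	uint32_t apic_id;
	uint32_t acpi_proc_uid;
};

/*
 * Fill order[] with entry indices in the order they would be attached,
 * boot processor first. order[] must have room for n indices.
 * Returns the number of indices written.
 */
size_t
attach_order(const struct x2apic_entry *e, size_t n, uint32_t bp_apic_id,
    size_t *order)
{
	size_t out = 0;	/* entries attached so far */

	assert(order != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		int is_bp = (e[i].apic_id == bp_apic_id);

		if (out == 0 && !is_bp) {
			/* AP before BP: look forward and attach BP first. */
			for (size_t j = i + 1; j < n; j++) {
				if (e[j].apic_id == bp_apic_id) {
					order[out++] = j;
					break;
				}
			}
		}
		/*
		 * Attach this entry unless it is the BP and something was
		 * already attached, which means the BP went in early.
		 */
		if (out == 0 || !is_bp)
			order[out++] = i;
	}
	return out;
}
```

For a table with APIC IDs 16, 0, 8 and a boot processor ID of 0, the
resulting order is entry 1, then entry 0, then entry 2. When the boot
processor is already first, the table order is left unchanged.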

Thanks.

diff /usr/src/sys/dev/acpi/acpimadt.c
58,83d57
< void
< acpimadt_attach_prep(struct cpu_attach_args *caa, int role, int apic_id, int acpi_proc_uid)
< {
< 	memset(caa, 0, sizeof(*caa));
< 	caa->cpu_role = role;
< 	caa->caa_name = "cpu";
< 	caa->cpu_apicid = apic_id;
< 	caa->cpu_acpi_proc_id = acpi_proc_uid;
< #ifdef MULTIPROCESSOR
< 	caa->cpu_func = &mp_cpu_funcs;
< #endif
< #ifdef __i386__
< 	/*
< 	 * XXX utterly wrong.  These are the
< 	 * cpu_feature/cpu_id from the BSP cpu, now
< 	 * being given to another cpu.  This is
< 	 * bullshit.
< 	 */
< 	extern int cpu_id, cpu_feature;
< 	caa->cpu_signature = cpu_id;
< 	caa->feature_flags = cpu_feature;
< #endif
< 	if (role == CPU_ROLE_AP)
< 		ncpusfound++;
< }
<
228d201
< 	int role;
254d226
<
256,257d227
< 	int x2apic_cpus = 0;
< 	int self_id = lapic_cpu_number();
284,286c254,278
< 			role = (self_id == entry->madt_lapic.apic_id) ? CPU_ROLE_BP : CPU_ROLE_AP;
< 			acpimadt_attach_prep(&caa, role, entry->madt_lapic.apic_id,
< 			    entry->madt_lapic.acpi_proc_id);
---
> 			memset(&caa, 0, sizeof(struct cpu_attach_args));
> 			if (lapic_cpu_number() == entry->madt_lapic.apic_id)
> 				caa.cpu_role = CPU_ROLE_BP;
> 			else {
> 				caa.cpu_role = CPU_ROLE_AP;
> 				ncpusfound++;
> 			}
> 			caa.caa_name = "cpu";
> 			caa.cpu_apicid = entry->madt_lapic.apic_id;
> 			caa.cpu_acpi_proc_id = entry->madt_lapic.acpi_proc_id;
> #ifdef MULTIPROCESSOR
> 			caa.cpu_func = &mp_cpu_funcs;
> #endif
> #ifdef __i386__
> 			/*
> 			 * XXX utterly wrong.  These are the
> 			 * cpu_feature/cpu_id from the BSP cpu, now
> 			 * being given to another cpu.  This is
> 			 * bullshit.
> 			 */
> 			extern int cpu_id, cpu_feature;
> 			caa.cpu_signature = cpu_id;
> 			caa.feature_flags = cpu_feature;
> #endif
>
317,349c309,314
< 			role = (entry->madt_x2apic.apic_id == self_id) ?
< 			    CPU_ROLE_BP : CPU_ROLE_AP;
< 			if (!x2apic_cpus && (role != CPU_ROLE_BP)) {
< 				/* AP before BP: find/attach BP first else boot fails */
< 				caddr_t find = addr;
< 				while (find < (caddr_t)madt + madt->hdr.length) {
< 					union acpi_madt_entry *entry2 = (union acpi_madt_entry *)find;
< 					if (entry2->madt_lapic.apic_type != entry->madt_lapic.apic_type)
< 						break;
<
< 					int role2 = (entry2->madt_x2apic.apic_id == self_id) ?
< 					    CPU_ROLE_BP : CPU_ROLE_AP;
< 					printf("%s: %s: acpi_proc_uid %02x, apic_id %02x, flags 0x%x\n",
< 					    self->dv_xname,
< 					    role2 == CPU_ROLE_BP ? "BP-found" : "AP-defer",
< 					    entry2->madt_x2apic.acpi_proc_uid,
< 					    entry2->madt_x2apic.apic_id, entry2->madt_x2apic.flags);
< 					if (role2 == CPU_ROLE_BP) {
< 						acpimadt_attach_prep(&caa, role2,
< 						    entry2->madt_x2apic.apic_id,
< 						    entry2->madt_x2apic.acpi_proc_uid);
< 						config_found(mainbus, &caa, acpimadt_print);
< 						x2apic_cpus++;
< 						break;
< 					}
< 					find += entry2->madt_lapic.length;
< 				}
< 			}
< 			if (!x2apic_cpus || (role != CPU_ROLE_BP)) {
< 				acpimadt_attach_prep(&caa, role, entry->madt_x2apic.apic_id,
< 				    entry->madt_x2apic.acpi_proc_uid);
< 				config_found(mainbus, &caa, acpimadt_print);
< 				x2apic_cpus++;
---
> 			memset(&caa, 0, sizeof(struct cpu_attach_args));
> 			if (lapic_cpu_number() == entry->madt_x2apic.apic_id)
> 				caa.cpu_role = CPU_ROLE_BP;
> 			else {
> 				caa.cpu_role = CPU_ROLE_AP;
> 				ncpusfound++;
350a316,334
> 			caa.caa_name = "cpu";
> 			caa.cpu_apicid = entry->madt_x2apic.apic_id;
> 			caa.cpu_acpi_proc_id = entry->madt_x2apic.acpi_proc_uid;
> #ifdef MULTIPROCESSOR
> 			caa.cpu_func = &mp_cpu_funcs;
> #endif
> #ifdef __i386__
> 			/*
> 			 * XXX utterly wrong.  These are the
> 			 * cpu_feature/cpu_id from the BSP cpu, now
> 			 * being given to another cpu.  This is
> 			 * bullshit.
> 			 */
> 			extern int cpu_id, cpu_feature;
> 			caa.cpu_signature = cpu_id;
> 			caa.feature_flags = cpu_feature;
> #endif
>
> 			config_found(mainbus, &caa, acpimadt_print);
