
From: Stefan Fritsch <sf@sfritsch.de>
Subject: Re: Mostly-Inverted CPU-Core ordering
To: Greg Schaefer <gsgs7878@proton.me>
Cc: "tech@openbsd.org" <tech@openbsd.org>
Date: Tue, 13 Jan 2026 15:23:48 +0100

On Mon, 12 Jan 2026, Greg Schaefer wrote:

> While debugging another issue, I noticed that amd64/cpu_attach (and it
> appears others) reorders the ACPI x2APIC CPU list. In a hybrid world, I
> think Intel recommends x2APIC ordering by "CPU power" (P-cores first,
> E-cores next, LP-cores last). cpu_info_list contains a static primary
> (boot-CPU) record and a linked list of the remaining application
> processors. The boot core is copied to the static allocation, but the
> remaining cores are head-inserted (i.e. reversed) into the linked list.
> 
>         case CPU_ROLE_AP:
> #if defined(MULTIPROCESSOR)
>                 cpu_intr_init(ci);
>                 cpu_start_secondary(ci);
>                 clockqueue_init(&ci->ci_queue);
>                 sched_init_cpu(ci);
>                 ncpus++;
>                 if (ci->ci_flags & CPUF_PRESENT) {
>  **                      ci->ci_next = cpu_info_list->ci_next;
>  **                      cpu_info_list->ci_next = ci;
> 
> Using the example of a hybrid processor with 2 P-cores, 2 E-cores and 2
> LP-cores, the resulting ordering is P-core 0, LP-core 1, LP-core 0,
> E-core 1, E-core 0, P-core 1. When NICs set up their interrupt queues
> (often a power-of-two count) via intrmap_create(), they are striped
> across the cores in that mostly-reversed order.
> 
> Unclear what, if any, performance impact this has, but it can make debugging
> interesting. The simple fix is appending to cpu_info_list instead of reverse
> head-inserting. Please ignore if a known/non-issue.

I can confirm your analysis. I don't have access to a hybrid CPU system
at the moment, but I have Linux cpuinfo output and ACPI table info from
two Dell laptops with Core i7-1270P and Core i7-1370P, and they do seem
to have the P-cores before the E-cores in the MADT. Also, if I enable
INTRDEBUG in amd64/intr.c, I see that CPU_INFO_FOREACH traverses the
CPUs in the order 0,5,4,3,2,1 on a 6-VCPU VM, and intrmap_create() uses
CPU_INFO_FOREACH.
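
To illustrate, here is a throwaway userland sketch (fake struct and
names, not the kernel code): head-inserting the APs after a static
boot-CPU entry reproduces exactly that 0,5,4,3,2,1 traversal.

#include <stdio.h>

struct fake_ci {
	int num;
	struct fake_ci *next;
};

int
main(void)
{
	static struct fake_ci boot = { 0, NULL };	/* static primary (boot CPU) entry */
	struct fake_ci ap[5], *ci;
	int i;

	for (i = 0; i < 5; i++) {
		ap[i].num = i + 1;			/* APs 1..5 in MADT order */
		ap[i].next = boot.next;			/* head-insert after the boot CPU */
		boot.next = &ap[i];
	}
	for (ci = &boot; ci != NULL; ci = ci->next)
		printf("%d ", ci->num);			/* prints: 0 5 4 3 2 1 */
	printf("\n");
	return 0;
}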

I also agree that one probably wants multi-queue NICs to use the CPUs
with the highest performance, and re-ordering the cpu_info_list seems to
have that effect, at least on Intel. The amd64 diff below puts them in
the correct order and seems to work. But the same code seems to be
present in most (all?) other MP architectures.

I don't know what effect the different ordering could have on non-hybrid
systems. On multi-socket systems, the interrupt distribution over the
sockets may change, but I don't see a problem with that. The order in
which IPIs are sent will also change, and the scheduler will likely also
do things slightly differently, but I have no idea if that would have a
measurable impact. Anything else?


diff --git a/sys/arch/amd64/amd64/cpu.c b/sys/arch/amd64/amd64/cpu.c
index 18cfb8d6377..b2e59ae800f 100644
--- a/sys/arch/amd64/amd64/cpu.c
+++ b/sys/arch/amd64/amd64/cpu.c
@@ -583,7 +583,7 @@ cpu_attach(struct device *parent, struct device *self, void *aux)
 {
 	struct cpu_softc *sc = (void *) self;
 	struct cpu_attach_args *caa = aux;
-	struct cpu_info *ci;
+	struct cpu_info *ci, *ci_last;
 #if defined(MULTIPROCESSOR)
 	int cpunum = sc->sc_dev.dv_unit;
 	vaddr_t kstack;
@@ -729,8 +729,10 @@ cpu_attach(struct device *parent, struct device *self, void *aux)
 		sched_init_cpu(ci);
 		ncpus++;
 		if (ci->ci_flags & CPUF_PRESENT) {
-			ci->ci_next = cpu_info_list->ci_next;
-			cpu_info_list->ci_next = ci;
+			ci_last = cpu_info_list;
+			while (ci_last->ci_next != NULL)
+				ci_last = ci_last->ci_next;
+			ci_last->ci_next = ci;
 		}
 #else
 		printf("%s: not started\n", sc->sc_dev.dv_xname);