From: Jan Klemkow <jan@openbsd.org>
Subject: Re: Device errors with Xeon w5-2545
To: Mark Kettenis <mark.kettenis@xs4all.nl>
Cc: tech@openbsd.org
Date: Fri, 7 Feb 2025 13:46:47 +0100

On Thu, Feb 06, 2025 at 02:17:44PM GMT, Mark Kettenis wrote:
> > Date: Thu, 6 Feb 2025 11:35:00 +0100
> > From: Jan Klemkow <jan@openbsd.org>
> > I ran into trouble running OpenBSD on newer Intel systems.
> > 
> > I have two systems with an ASRockRack W790D8UD-1L1N2T mainboard, but
> > with two different CPUs: w5-2545 and w5-3525.  The system with the
> > w5-3525 CPU just works as expected.
> > 
> > The system with the w5-2545 CPU does not.  It hangs during boot and I got
> > several devices with errors while debugging this:
> > 
> > ahci(4) reports failures on the first command timeout, due to a busy
> > controller.  xhci(4) dies on the first interrupt due to 0xffffffff in the
> > status register.  em(4) reports an invalid EEPROM checksum during
> > initialization.  Also, nvme(4) is not responsive.
> > 
> > During the hang, it's not possible to trap into ddb with db_console=1 and
> > a break via serial console.  It's possible to work around the hang by
> > disabling the xhci(4) driver via UKC.  But the other devices still don't work.
> > 
> > A boot of Debian/Linux-stable shows that all devices operate normally here.
> > 
> > I flashed the latest BIOS from the vendor on both boards and compared all
> > BIOS configuration options.
> > 
> > Dmesgs of both systems are below.
> > 
> > To me, it looks like trouble on the PCIe bus, because all PCIe
> > devices have problems.  But I can't find a debugging approach to start
> > with while looking at the code.
> > 
> > Does anyone have a hint for me on how to debug this?
> 
> Try removing those ixl(4) interfaces from the non-working system.

I've removed them, but it had no effect.  The system still hangs and
the devices show the same errors as before.

I also tried to find quirks in the Linux kernel code for newer
Intel CPUs or chipsets, without any success.  Is there something similar I
can check?

Or does someone on the list have a running system with those Xeon w5-xxxx
CPUs?

Thanks,
Jan