
From: Hrvoje Popovski <hrvoje@srce.hr>
Subject: Re: Device errors with Xeon w5-2545
To: tech@openbsd.org
Date: Fri, 7 Feb 2025 15:52:18 +0100

On 7.2.2025. 13:46, Jan Klemkow wrote:
> On Thu, Feb 06, 2025 at 02:17:44PM GMT, Mark Kettenis wrote:
>>> Date: Thu, 6 Feb 2025 11:35:00 +0100
>>> From: Jan Klemkow <jan@openbsd.org>
>>> I'm having trouble running OpenBSD on newer Intel systems.
>>>
>>> I have two systems with an ASRockRack W790D8UD-1L1N2T mainboard, but
>>> with two different CPUs, w5-2545 and w5-3525.  The system with the
>>> w5-3525 CPU just works as expected.
>>>
>>> The system with the w5-2545 CPU does not.  It hangs during boot, and I
>>> saw errors from several devices while debugging this:
>>>
>>> ahci(4) reports failures on the first command timeout, due to a busy
>>> controller.  xhci(4) dies on the first interrupt due to 0xffffffff in the
>>> status register.  em(4) reports an invalid EEPROM checksum during
>>> initialization.  Also, nvme(4) is not responsive.
>>>
>>> During the hang, it's not possible to trap into ddb with db_console=1 and
>>> a break via serial console.  It's possible to work around the hang by
>>> disabling the xhci(4) driver via UKC.  But the other devices still don't work.
>>>
>>> A boot of Debian/Linux-stable shows that all devices operate normally here.
>>>
>>> I flashed the latest BIOS from the vendor on both boards and compared all
>>> BIOS configuration options.
>>>
>>> Dmesgs of both systems are below.
>>>
>>> To me, it looks like a problem with the PCIe bus, because all PCIe
>>> devices have problems.  But I can't find a debugging approach to start
>>> with while looking at the code.
>>>
>>> Does anyone have a hint for me on how to debug this?
>>
>> Try removing those ixl(4) interfaces from the non-working system.
> 
> I've removed them, but it has no effect.  The system still hangs and
> the devices show the same errors as before.
> 
> I also tried to find quirks for newer Intel CPUs or chipsets in the
> Linux kernel code, without any success.  Is there something similar I
> can check?
> 
> Or does anyone on the list here have a running system with those Xeon
> w5-xxxx CPUs?
> 
> Thanks,
> Jan
> 

Does it behave any better if you lower the number of enabled CPU cores to
4 or 8 in the BIOS?