Download raw body.
bge/bnx/iavf/igc/ix/ixl/ngbe/pcn: ifq_restart() fix
From: Alexander Bluhm <bluhm@openbsd.org>
Subject: Re: bge/bnx/iavf/igc/ix/ixl/ngbe/pcn: ifq_restart() fix
Date: Mon, 23 Jun 2025 20:04:06 +0200
> On Fri, Jun 20, 2025 at 12:12:14PM +0200, Stefan Sperling wrote:
>> A bug has been fixed by yasuaok@ in vmx(4) where the driver was
>> calling ifq_restart() without actually having made any space on
>> a full Tx ring. Calling ifq_restart() in this case can lead to
>> a condition where the interface gets stuck in OACTIVE until the
>> interface is reset with ifconfig.
>>
>> I could trigger the same bug on ice(4) with iperf as follows:
>> for i in `seq 5`; do (iperf -l0 -t 0 -c 2001:db8::1 -u &) ; done
>> This needs another machine which runs iperf -u -s (and -V if using IPv6).
>>
>> I have checked all ethernet drivers in dev/pci which call ifq_restart().
>> A few already avoid calling ifq_restart() unnecessarily.
>> The same bug affects bge, bnx, iavf, igc, ix, ixl, ngbe, and pcn.
>>
>> The bad pattern looks like:
>>
>> while (cons != prod) {
>> descriptor = &ring[cons];
>>
>> if (descriptor is not done)
>> break;
>>
>> free descriptor;
>> }
>>
>> if (ifq_is_oactive(ifq))
>> ifq_restart(ifq);
>>
>> If we break out of the loop without freeing any descriptor, we will
>> call ifq_restart() without having made any space on the ring.
>> For example, this can happen when drivers invoke both Tx and Rx interrupt
>> handlers during the same MSIX interrupt cause, An Rx interrupt can then
>> lead to processing of a Tx ring which has made no progress.
>>
>> Fixed code looks like:
>>
>> done = 0;
>>
>> while (cons != prod) {
>> desc = &ring[cons];
>>
>> if (desc is not done)
>> break;
>>
>> free descriptor;
>> done = 1;
>> }
>>
>> if (done && ifq_is_oactive(ifq))
>> ifq_restart(ifq);
>>
>>
>> I am looking for tests and OKs.
>> I myself don't have any hardware for these drivers to test with.
>
> I have tested ix, ixl, igc. OK bluhm@ for these.
>
> In August server room reconstruction should be finished, then I
> will have a setup with bge again.
>
> jan@ could test iavf, but it needs manual setup and is not that
> easy.
>
> And what should we do with the drivers noone has hardware and wants
> to test? Just commit?
>
> Any volunteers for bnx, iavf, ngbe, pcn?
I tested the iavf patch on my ESXi virtual machine. It works for me.
I don't see the stuck issue with the iavf driver, but the patch is harmless.
--
Yuichiro NAITO (naito.yuichiro@gmail.com)
bge/bnx/iavf/igc/ix/ixl/ngbe/pcn: ifq_restart() fix