vio(4): recover from missed RX interrupts in vio_rxtick
On my Intel(R) Atom(TM) CPU C3558 with 4 cores and 14 VMs, this has a
noticeable positive impact on the VMs' network interfaces. I have been
running this diff on all VMs since the end of February; before that I
would see one VM (vio interface) lose its connection every few days.
I still see some races when starting and stopping VMs, especially
after a physical reboot (vionet_rx: driver not ready), but with this
diff, once the interface is up it stays up.
On Sun, Feb 22, 2026 at 07:22:16PM +0100, Renaud Allard wrote:
> Hi,
>
> I've been running an OpenBSD 7.8 VM on Oracle Cloud (arm64, KVM) with a
> vio(4) interface doing sustained 50-100 Mbps. Every few days, the
> interface goes completely dead -- no packets in or out. A reboot from
> the cloud console fixes it for another few days.
>
> I traced the problem through the driver and I believe the root cause is
> that vio_rxtick doesn't poll the RX used ring the way vio_txtick polls
> the TX used ring. If an RX interrupt gets lost (which can happen with
> EVENT_IDX -- the man page already has flag 0x2 as a workaround for
> exactly this class of bug), the RX side has no way to recover.
>
> Here's the diff, then the explanation.
>
> Index: sys/dev/pv/if_vio.c
> ===================================================================
> RCS file: /cvs/src/sys/dev/pv/if_vio.c,v
> retrieving revision 1.78
> diff -u -p -r1.78 if_vio.c
> --- sys/dev/pv/if_vio.c 15 Jan 2026 09:06:19 -0000 1.78
> +++ sys/dev/pv/if_vio.c 22 Feb 2026 00:00:00 -0000
> @@ -1661,6 +1661,7 @@ vio_rxtick(void *arg)
>  	int i;
>  
>  	for (i = 0; i < sc->sc_nqueues; i++) {
> +		virtio_check_vq(sc->sc_virtio, sc->sc_q[i].viq_rxvq);
>  		mtx_enter(&sc->sc_q[i].viq_rxmtx);
>  		vio_populate_rx_mbufs(sc, &sc->sc_q[i]);
>  		mtx_leave(&sc->sc_q[i].viq_rxmtx);
>
>
> The problem
> -----------
>
> There's an asymmetry between how the TX and RX timer handlers work.
>
> vio_txtick calls virtio_check_vq on the TX used ring. The comment
> above vio_tx_intr explains why:
>
> vio_txtick is used to make sure that mbufs are dequeued and freed
> even if no further transfer happens.
>
> So if a TX interrupt is lost, vio_txtick picks up the slack within a
> second. This is the right thing to do.
>
> vio_rxtick, on the other hand, only calls vio_populate_rx_mbufs, which
> adds new buffers to the available ring. It never looks at the used
> ring. If an RX interrupt is lost, nobody ever drains the completed
> packets.
>
>
> What happens when an RX interrupt is lost
> ------------------------------------------
>
> When VIRTIO_F_RING_EVENT_IDX is negotiated (the default), the driver
> tells the host "interrupt me when the used index reaches N" by writing
> N into VQ_USED_EVENT. The host is supposed to compare its used index
> against this value and fire an interrupt when it crosses the threshold.
>
> The virtio spec (2.7.7.1) says this mechanism is "not reliable, as
> they are not synchronized with the device." The vio(4) man page
> documents flag 0x2 as a workaround for hosts that get this wrong.
> Oracle Cloud's KVM on arm64 appears to be one of those hosts.
>
> When the interrupt doesn't arrive, here's what happens:
>
> 1. Completed packets pile up in the used ring. vio_rxeof never runs,
> so if_rxr_put never frees those ring slots.
>
> 2. vio_rxtick fires every second and calls vio_populate_rx_mbufs.
> But if_rxr_get says there are zero free slots -- the driver thinks
> all the buffers are still in flight. So nothing gets added.
>
> 3. The host runs out of available buffers and starts dropping packets.
> No RX means no TCP ACKs, TX dries up, the interface is dead.
>
> 4. This loops forever. vio_rxtick keeps firing, keeps finding zero
> free slots, keeps doing nothing. Only a reboot recovers.
>
>
> Why the fix works
> -----------------
>
> Adding virtio_check_vq before the existing vio_populate_rx_mbufs call
> means: first check if there's unprocessed work in the used ring, drain
> it if there is, then refill the available ring.
>
> On the normal path (interrupts working fine), virtio_check_vq sees
> vq_used_idx == vq_used->idx and returns immediately. One DMA sync
> and one integer comparison, once a second. No measurable cost.
>
> On the recovery path (missed interrupt), virtio_check_vq finds stale
> completions, calls vio_rx_intr, which drains them via vio_rxeof, frees
> the ring slots, refills the available ring, and re-enables interrupts.
> Normal operation resumes within one second.
>
>
> Why virtio_check_vq goes before the mutex
> ------------------------------------------
>
> virtio_check_vq calls vq->vq_done, which for RX is vio_rx_intr.
> vio_rx_intr takes viq_rxmtx internally, so it handles its own locking.
> The call has to go before the mtx_enter/vio_populate_rx_mbufs/mtx_leave
> block so that the used ring is drained first, freeing slots, and then
> vio_populate_rx_mbufs can refill them -- all in the same tick.
>
> If it went after, vio_populate_rx_mbufs would still find zero free
> slots, do nothing, and the refill would have to wait for the next tick.
>
>
> Why it's safe to call without the mutex
> ----------------------------------------
>
> Every existing caller of virtio_check_vq calls it without holding
> viq_rxmtx. This is the established pattern throughout the driver:
>
> - vio_txtick calls it from timeout context, no mutex
> - vio_queue_intr calls it from interrupt context, no mutex
> - virtio_pci_queue_intr calls it from interrupt context, no mutex
> - virtio_pci_shared_queue_intr, same
> - virtio_pci_legacy_intr_mpsafe, same
>
> The RX callback (vio_rx_intr) acquires viq_rxmtx as its first action,
> so the locking is self-contained. The new call in vio_rxtick is
> identical in pattern to the existing call in vio_txtick -- same
> function, same context, same convention.
>
>
> Interrupt modes
> ---------------
>
> I checked all four interrupt configurations to make sure the fix is
> useful in each:
>
> 1. Child-managed MSI-X (multi-queue): TX and RX share a vector per
> queue pair via vio_queue_intr, which checks both. If the host
> suppresses the interrupt entirely, neither gets checked.
>
> 2. Per-VQ MSI-X: each VQ has its own vector. A lost RX interrupt
> cannot be recovered by a TX interrupt at all.
>
> 3. Shared MSI-X: all VQs share one vector. A TX interrupt also
> checks RX. But under heavy inbound-only traffic, TX interrupts
> become infrequent.
>
> 4. Legacy: same as shared.
>
> In all four cases, vio_rxtick is the only guaranteed periodic check,
> and it's the only thing that can reliably recover from a lost RX
> interrupt regardless of traffic pattern.
>
>
> Best Regards