From: Moritz Buhl
Subject: Re: vio(4): recover from missed RX interrupts in vio_rxtick
To: tech@openbsd.org
Date: Wed, 15 Apr 2026 04:58:31 +0200

On my Intel(R) Atom(TM) CPU C3558 with 4 cores and 14 vms, I feel like
this has a positive impact on the vms' network interfaces. I have been
running this diff on all vms since the end of February, and prior to
this diff I would see one vm (vio interface) lose its connection every
few days.

I still see some races when starting and stopping vms, especially
after a physical reboot (vionet_rx: driver not ready), but once the
interface is up, it stays working with this diff.

On Sun, Feb 22, 2026 at 07:22:16PM +0100, Renaud Allard wrote:
> Hi,
>
> I've been running an OpenBSD 7.8 VM on Oracle Cloud (arm64, KVM) with a
> vio(4) interface doing sustained 50-100 Mbps. Every few days, the
> interface goes completely dead -- no packets in or out. A reboot from
> the cloud console fixes it for another few days.
>
> I traced the problem through the driver and I believe the root cause is
> that vio_rxtick doesn't poll the RX used ring the way vio_txtick polls
> the TX used ring. If an RX interrupt gets lost (which can happen with
> EVENT_IDX -- the man page already documents flag 0x2 as a workaround for
> exactly this class of bug), the RX side has no way to recover.
>
> Here's the diff, then the explanation.
>
> Index: sys/dev/pv/if_vio.c
> ===================================================================
> RCS file: /cvs/src/sys/dev/pv/if_vio.c,v
> retrieving revision 1.78
> diff -u -p -r1.78 if_vio.c
> --- sys/dev/pv/if_vio.c	15 Jan 2026 09:06:19 -0000	1.78
> +++ sys/dev/pv/if_vio.c	22 Feb 2026 00:00:00 -0000
> @@ -1661,6 +1661,7 @@ vio_rxtick(void *arg)
>  	int i;
>  
>  	for (i = 0; i < sc->sc_nqueues; i++) {
> +		virtio_check_vq(sc->sc_virtio, sc->sc_q[i].viq_rxvq);
>  		mtx_enter(&sc->sc_q[i].viq_rxmtx);
>  		vio_populate_rx_mbufs(sc, &sc->sc_q[i]);
>  		mtx_leave(&sc->sc_q[i].viq_rxmtx);
>
>
> The problem
> -----------
>
> There's an asymmetry between how the TX and RX timer handlers work.
>
> vio_txtick calls virtio_check_vq on the TX used ring. The comment
> above vio_tx_intr explains why:
>
>     vio_txtick is used to make sure that mbufs are dequeued and freed
>     even if no further transfer happens.
>
> So if a TX interrupt is lost, vio_txtick picks up the slack within a
> second. This is the right thing to do.
>
> vio_rxtick, on the other hand, only calls vio_populate_rx_mbufs, which
> adds new buffers to the available ring. It never looks at the used
> ring. If an RX interrupt is lost, nobody ever drains the completed
> packets.
>
>
> What happens when an RX interrupt is lost
> -----------------------------------------
>
> When VIRTIO_F_RING_EVENT_IDX is negotiated (the default), the driver
> tells the host "interrupt me when the used index reaches N" by writing
> N into VQ_USED_EVENT. The host is supposed to compare its used index
> against this value and fire an interrupt when it crosses the
> threshold.
>
> The virtio spec (2.7.7.1) says this mechanism is "not reliable, as
> they are not synchronized with the device." The vio(4) man page
> documents flag 0x2 as a workaround for hosts that get this wrong.
> Oracle Cloud's KVM on arm64 appears to be one of those hosts.
>
> When the interrupt doesn't arrive, here's what happens:
>
> 1. Completed packets pile up in the used ring. vio_rxeof never runs,
>    so if_rxr_put never frees those ring slots.
>
> 2. vio_rxtick fires every second and calls vio_populate_rx_mbufs.
>    But if_rxr_get says there are zero free slots -- the driver thinks
>    all the buffers are still in flight. So nothing gets added.
>
> 3. The host runs out of available buffers and starts dropping packets.
>    No RX means no TCP ACKs, TX dries up, and the interface is dead.
>
> 4. This loops forever. vio_rxtick keeps firing, keeps finding zero
>    free slots, and keeps doing nothing. Only a reboot recovers.
>
>
> Why the fix works
> -----------------
>
> Adding virtio_check_vq before the existing vio_populate_rx_mbufs call
> means: first check whether there's unprocessed work in the used ring,
> drain it if there is, then refill the available ring.
>
> On the normal path (interrupts working fine), virtio_check_vq sees
> vq_used_idx == vq_used->idx and returns immediately. One DMA sync
> and one integer comparison, once a second. No measurable cost.
>
> On the recovery path (missed interrupt), virtio_check_vq finds stale
> completions and calls vio_rx_intr, which drains them via vio_rxeof,
> frees the ring slots, refills the available ring, and re-enables
> interrupts. Normal operation resumes within one second.
>
>
> Why virtio_check_vq goes before the mutex
> -----------------------------------------
>
> virtio_check_vq calls vq->vq_done, which for RX is vio_rx_intr.
> vio_rx_intr takes viq_rxmtx internally, so it handles its own
> locking. The call has to go before the
> mtx_enter/vio_populate_rx_mbufs/mtx_leave block so that the used ring
> is drained first, freeing slots, and then vio_populate_rx_mbufs can
> refill them -- all in the same tick.
>
> If it went after, vio_populate_rx_mbufs would still find zero free
> slots, do nothing, and the refill would have to wait for the next
> tick.
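The drain-then-refill ordering can be illustrated with a toy model. This is
a hypothetical sketch, not the if_vio.c data structures: a single counter
stands in for the rxr slot accounting, toy_drain() for the
virtio_check_vq -> vio_rx_intr -> vio_rxeof/if_rxr_put path, and
toy_refill() for vio_populate_rx_mbufs/if_rxr_get.

```c
/*
 * Toy model of the vio_rxtick ordering. All names are illustrative,
 * not taken from if_vio.c.
 */
#include <assert.h>

struct toyring {
	unsigned int dev_used_idx;	/* completions published by the host */
	unsigned int drv_used_idx;	/* completions consumed by the driver */
	unsigned int free_slots;	/* slots available for refilling */
};

/* Stands in for draining the used ring: each consumed completion
 * frees its ring slot (what if_rxr_put does). */
static unsigned int
toy_drain(struct toyring *r)
{
	unsigned int n = 0;

	while (r->drv_used_idx != r->dev_used_idx) {
		r->drv_used_idx++;
		r->free_slots++;
		n++;
	}
	return n;
}

/* Stands in for refilling the available ring: it can only hand out
 * slots that are currently free (what if_rxr_get enforces). */
static unsigned int
toy_refill(struct toyring *r)
{
	unsigned int n = r->free_slots;

	r->free_slots = 0;
	return n;
}
```

With four stale completions and zero free slots, refilling first adds
nothing (the old vio_rxtick, looping forever); draining first reclaims the
four slots, and the refill reuses them within the same tick.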
>
>
> Why it's safe to call without the mutex
> ---------------------------------------
>
> Every existing caller of virtio_check_vq calls it without holding
> viq_rxmtx. This is the established pattern throughout the driver:
>
> - vio_txtick calls it from timeout context, no mutex
> - vio_queue_intr calls it from interrupt context, no mutex
> - virtio_pci_queue_intr calls it from interrupt context, no mutex
> - virtio_pci_shared_queue_intr, same
> - virtio_pci_legacy_intr_mpsafe, same
>
> The RX callback (vio_rx_intr) acquires viq_rxmtx as its first action,
> so the locking is self-contained. The new call in vio_rxtick is
> identical in pattern to the existing call in vio_txtick -- same
> function, same context, same convention.
>
>
> Interrupt modes
> ---------------
>
> I checked all four interrupt configurations to make sure the fix is
> useful in each:
>
> 1. Child-managed MSI-X (multi-queue): TX and RX share a vector per
>    queue pair via vio_queue_intr, which checks both. If the host
>    suppresses the interrupt entirely, neither gets checked.
>
> 2. Per-VQ MSI-X: each VQ has its own vector. A lost RX interrupt
>    cannot be recovered by a TX interrupt at all.
>
> 3. Shared MSI-X: all VQs share one vector. A TX interrupt also
>    checks RX. But under heavy inbound-only traffic, TX interrupts
>    become infrequent.
>
> 4. Legacy: same as shared.
>
> In all four cases, vio_rxtick is the only guaranteed periodic check,
> and it's the only thing that can reliably recover from a lost RX
> interrupt regardless of traffic pattern.
>
>
> Best Regards
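For reference, the threshold test the spec expects the device to apply when
EVENT_IDX is negotiated is its vring_need_event() helper; a standalone
rendering follows (identifiers per the spec, not from if_vio.c). The
uint16_t wraparound is what makes the window comparison work across index
wrap.

```c
/*
 * EVENT_IDX suppression test as defined by the virtio spec: the driver
 * writes event_idx (used_event); the device, having moved its used
 * index from old_idx to new_idx, notifies only if the write crossed
 * the event_idx threshold.
 */
#include <assert.h>
#include <stdint.h>

static int
vring_need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old_idx)
{
	return (uint16_t)(new_idx - event_idx - 1) <
	    (uint16_t)(new_idx - old_idx);
}
```

If the device evaluates this against a stale event_idx, the notification is
suppressed and nothing ever re-delivers it; that is exactly the window the
vio_rxtick change covers.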