Download raw body.
Don't release NET_LOCK in vio(4) ioctl
> Date: Wed, 24 Apr 2024 22:25:01 +0200 (CEST)
> From: Stefan Fritsch <sf@sfritsch.de>
>
> Hi,
>
> it seems I went down the wrong rabbit hole with my first diff. The actual
> issue is that signals are not handled correctly while sleeping. After a
> signal, there is a race condition where sc_ctrl_inuse is first set to FREE
> and then the interrupt handler sets it to DONE, causing a hang in the next
> vio_wait_ctrl() call. This was probably not noticed before NET_LOCK
> existed, because an ifconfig down/up fixed it. However since the NET_LOCK
> now serializes the relevant ioctl paths, a parallel ifconfig down now
> hangs too, waiting for the NET_LOCK. The attached patch does three things:
>
> * Revert the NET_LOCK unlocking work-around.
>
> * Remove PCATCH from the sleep calls so that we always wait for the control
> requests to complete, avoiding the race with vio_ctrleof().
What guarantee is there that those control requests complete?
> * Avoid a crash if there is outgoing traffic while doing ifconfig down.
> The proper fix for this would be intr_barrier() but I think that would
> need more refactoring to avoid problems with the NET_LOCK.
>
> With this I have seen neither crashes nor deadlocks. I would like to
> commit this and do further refactoring in separate steps.
>
> With the NET_LOCK, the code to handle control requests concurrent to a
> device reset cannot be hit anymore. Also it cannot happen that we have to
> wait for a free slot in the control queue. If we assume that there will
> always be some lock serializing the ioctl path (maybe with some
> per-interface lock to replace NET_LOCK), this code could be removed. Or
> vio could introduce its own cfg_lock like ixl(4) or cad(4).
>
> Comments, testers (especially with v6 autoconf), oks would be very
> welcome.
>
> Cheers,
> Stefan
>
> diff --git a/sys/dev/pv/if_vio.c b/sys/dev/pv/if_vio.c
> index 9ad8e36e346..7a49d1a8fdf 100644
> --- a/sys/dev/pv/if_vio.c
> +++ b/sys/dev/pv/if_vio.c
> @@ -1232,13 +1232,17 @@ vio_txeof(struct virtqueue *vq)
>
> while (virtio_dequeue(vsc, vq, &slot, &len) == 0) {
> struct virtio_net_hdr *hdr = &sc->sc_tx_hdrs[slot];
> + m = sc->sc_tx_mbufs[slot];
> + if (m == NULL) {
> + // interface is down by now
Don't use C++-style comments.
> + continue;
> + }
> r++;
> VIO_DMAMEM_SYNC(vsc, sc, hdr, sc->sc_hdr_size,
> BUS_DMASYNC_POSTWRITE);
> bus_dmamap_sync(vsc->sc_dmat, sc->sc_tx_dmamaps[slot], 0,
> sc->sc_tx_dmamaps[slot]->dm_mapsize,
> BUS_DMASYNC_POSTWRITE);
> - m = sc->sc_tx_mbufs[slot];
> bus_dmamap_unload(vsc->sc_dmat, sc->sc_tx_dmamaps[slot]);
> sc->sc_tx_mbufs[slot] = NULL;
> virtio_dequeue_commit(vq, slot);
> @@ -1363,32 +1367,13 @@ out:
> return r;
> }
>
> -/*
> - * XXXSMP As long as some per-ifp ioctl(2)s are executed with the
> - * NET_LOCK() deadlocks are possible. So release it here.
> - */
> -static inline int
> -vio_sleep(struct vio_softc *sc, const char *wmesg)
> -{
> - int status = rw_status(&netlock);
> -
> - if (status != RW_WRITE && status != RW_READ)
> - return tsleep_nsec(&sc->sc_ctrl_inuse, PRIBIO|PCATCH, wmesg,
> - INFSLP);
> -
> - return rwsleep_nsec(&sc->sc_ctrl_inuse, &netlock, PRIBIO|PCATCH, wmesg,
> - INFSLP);
> -}
> -
> int
> vio_wait_ctrl(struct vio_softc *sc)
> {
> int r = 0;
>
> while (sc->sc_ctrl_inuse != FREE) {
> - r = vio_sleep(sc, "viowait");
> - if (r == EINTR)
> - return r;
> + r = tsleep_nsec(&sc->sc_ctrl_inuse, PRIBIO, "viowait", INFSLP);
> }
> sc->sc_ctrl_inuse = INUSE;
>
> @@ -1402,12 +1387,10 @@ vio_wait_ctrl_done(struct vio_softc *sc)
>
> while (sc->sc_ctrl_inuse != DONE && sc->sc_ctrl_inuse != RESET) {
> if (sc->sc_ctrl_inuse == RESET) {
> - r = 1;
> + r = ENXIO;
> break;
> }
> - r = vio_sleep(sc, "viodone");
> - if (r == EINTR)
> - break;
> + r = tsleep_nsec(&sc->sc_ctrl_inuse, PRIBIO, "viodone", INFSLP);
> }
> return r;
> }
>
>
Don't release NET_LOCK in vio(4) ioctl