Index | Thread | Search

From:
Alexander Bluhm <bluhm@openbsd.org>
Subject:
Re: Don't release NET_LOCK in vio(4) ioctl
To:
Stefan Fritsch <sf@sfritsch.de>
Cc:
Vitaliy Makkoveev <mvs@openbsd.org>, tech@openbsd.org
Date:
Thu, 25 Apr 2024 16:22:57 +0200

Download raw body.

Thread
  • Martin Pieuchot:

    Don't release NET_LOCK in vio(4) ioctl

  • On Wed, Apr 24, 2024 at 10:25:01PM +0200, Stefan Fritsch wrote:
    > Hi,
    > 
    > it seems I went down the wrong rabbit hole with my first diff. The actual 
    > issue is that signals are not handled correctly while sleeping. After a 
    > signal, there is a race condition where sc_ctrl_inuse is first set to FREE 
    > and then the interrupt handler sets it to DONE, causing a hang in the next 
    > vio_wait_ctrl() call. This was probably not noticed before NET_LOCK 
    > existed, because an ifconfig down/up fixed it. However since the NET_LOCK 
    > now serializes the relevant ioctl paths, a parallel ifconfig down now 
    > hangs too, waiting for the NET_LOCK. The attached patch does three things:
    > 
    > * Revert the NET_LOCK unlocking work-around.
    > 
    > * Remove PCATCH from the sleep calls so that we always wait for the control
    >   requests to complete, avoiding the race with vio_ctrleof().
    > 
    > * Avoid a crash if there is outgoing traffic while doing ifconfig down. 
    >   The proper fix for this would be intr_barrier() but I think that would 
    >   need more refactoring to avoid problems with the NET_LOCK.
    > 
    > With this I have seen neither crashes nor deadlocks. I would like to 
    > commit this and do further refactoring in separate steps.
    > 
    > With the NET_LOCK, the code to handle control requests concurrent to a 
    > device reset cannot be hit anymore. Also it cannot happen that we have to 
    > wait for a free slot in the control queue. If we assume that there will 
    > always be some lock serializing the ioctl path (maybe with some 
    > per-interface lock to replace NET_LOCK), this code could be removed. Or 
    > vio could introduce its own cfg_lock like ixl(4) or cad(4).
    > 
    > Comments, testers (especially with v6 autoconf), oks would be very 
    > welcome.
    
    Other interface drivers check ISSET(ifp->if_flags, IFF_RUNNING))
    before calling txeof().  Any reason why you check sc->sc_tx_mbufs[slot]
    == NULL?  According to the comment it is the same purpose.
    
    > diff --git a/sys/dev/pv/if_vio.c b/sys/dev/pv/if_vio.c
    > index 9ad8e36e346..7a49d1a8fdf 100644
    > --- a/sys/dev/pv/if_vio.c
    > +++ b/sys/dev/pv/if_vio.c
    > @@ -1232,13 +1232,17 @@ vio_txeof(struct virtqueue *vq)
    >  
    >  	while (virtio_dequeue(vsc, vq, &slot, &len) == 0) {
    >  		struct virtio_net_hdr *hdr = &sc->sc_tx_hdrs[slot];
    > +		m = sc->sc_tx_mbufs[slot];
    > +		if (m == NULL) {
    > +			// interface is down by now
    > +			continue;
    > +		}
    >  		r++;
    >  		VIO_DMAMEM_SYNC(vsc, sc, hdr, sc->sc_hdr_size,
    >  		    BUS_DMASYNC_POSTWRITE);
    >  		bus_dmamap_sync(vsc->sc_dmat, sc->sc_tx_dmamaps[slot], 0,
    >  		    sc->sc_tx_dmamaps[slot]->dm_mapsize,
    >  		    BUS_DMASYNC_POSTWRITE);
    > -		m = sc->sc_tx_mbufs[slot];
    >  		bus_dmamap_unload(vsc->sc_dmat, sc->sc_tx_dmamaps[slot]);
    >  		sc->sc_tx_mbufs[slot] = NULL;
    >  		virtio_dequeue_commit(vq, slot);
    > @@ -1363,32 +1367,13 @@ out:
    >  	return r;
    >  }
    >  
    > -/*
    > - * XXXSMP As long as some per-ifp ioctl(2)s are executed with the
    > - * NET_LOCK() deadlocks are possible.  So release it here.
    > - */
    > -static inline int
    > -vio_sleep(struct vio_softc *sc, const char *wmesg)
    > -{
    > -	int status = rw_status(&netlock);
    > -
    > -	if (status != RW_WRITE && status != RW_READ)
    > -		return tsleep_nsec(&sc->sc_ctrl_inuse, PRIBIO|PCATCH, wmesg,
    > -		    INFSLP);
    > -
    > -	return rwsleep_nsec(&sc->sc_ctrl_inuse, &netlock, PRIBIO|PCATCH, wmesg,
    > -	    INFSLP);
    > -}
    > -
    >  int
    >  vio_wait_ctrl(struct vio_softc *sc)
    >  {
    >  	int r = 0;
    >  
    >  	while (sc->sc_ctrl_inuse != FREE) {
    > -		r = vio_sleep(sc, "viowait");
    > -		if (r == EINTR)
    > -			return r;
    > +		r = tsleep_nsec(&sc->sc_ctrl_inuse, PRIBIO, "viowait", INFSLP);
    >  	}
    >  	sc->sc_ctrl_inuse = INUSE;
    >  
    > @@ -1402,12 +1387,10 @@ vio_wait_ctrl_done(struct vio_softc *sc)
    >  
    >  	while (sc->sc_ctrl_inuse != DONE && sc->sc_ctrl_inuse != RESET) {
    >  		if (sc->sc_ctrl_inuse == RESET) {
    > -			r = 1;
    > +			r = ENXIO;
    >  			break;
    >  		}
    > -		r = vio_sleep(sc, "viodone");
    > -		if (r == EINTR)
    > -			break;
    > +		r = tsleep_nsec(&sc->sc_ctrl_inuse, PRIBIO, "viodone", INFSLP);
    >  	}
    >  	return r;
    >  }
    
    
    
  • Martin Pieuchot:

    Don't release NET_LOCK in vio(4) ioctl