Index | Thread | Search

From:
Dave Voutila <dv@sisu.io>
Subject:
Re: nvme: avoid use-after-free after shutdown
To:
Jonathan Matthew <jonathan@d14n.org>
Cc:
tech@openbsd.org
Date:
Thu, 12 Feb 2026 10:52:27 -0500

Download raw body.

Thread
Jonathan Matthew <jonathan@d14n.org> writes:

> On Wed, Feb 11, 2026 at 12:18:24PM -0500, Dave Voutila wrote:
>> Jonathan Matthew <jonathan@d14n.org> writes:
>>
>> > A recent report on bugs@ showed the nvme interrupt handler crashing
>> > during reboot.  It's a bit weird that the nvme controller raises an
>> > interrupt after we shut it down, but we shouldn't crash if that
>> > happens.
>> >
>> > The problem here is that we delete the IO queue and free the memory
>> > used for it, but the interrupt handler unconditionally tries to process
>> > it.  This diff just skips the IO queue if we've deleted it.  The
>> > bug reporter has confirmed that this fixes it for them.
>> >
>> > ok?
>> >
>> >
>> > Index: nvme.c
>> > ===================================================================
>> > RCS file: /cvs/src/sys/dev/ic/nvme.c,v
>> > diff -u -p -r1.126 nvme.c
>> > --- nvme.c	14 Jan 2026 01:07:57 -0000	1.126
>> > +++ nvme.c	4 Feb 2026 04:45:59 -0000
>> > @@ -557,6 +574,7 @@ nvme_shutdown(struct nvme_softc *sc)
>> >  		printf("%s: unable to delete q, disabling\n", DEVNAME(sc));
>> >  		goto disable;
>> >  	}
>> > +	sc->sc_q = NULL;
>>
>> I think this part can still race with a rogue interrupt because sc_q
>> will be non-NULL for some period of time after nvme_q_delete(), no?
>
> The nvme interrupt handler is not marked MPSAFE (yet), so this is all
> protected by the kernel lock.
>

Do we actually hold the kernel lock in nvme_shutdown()? Since it's
called via nvme_activate(), there are at least two areas where
config_suspend_all(DVACT_POWERDOWN) is called on amd64 (via boot() and
sleep_state()) and I honestly don't know if the callers there hold the
kernel lock.

I know acpi uses a taskq to asynchronously do suspend/hibernate
invocations of sleep_state() that could trigger this code path.

If sc_q is assumed protected by it, probably worth either an assert (if
we believe we're holding it) and annotating the nvme_softc for that
member.

> We should also do a better job of turning the interrupt off, but that's
> another diff or two.
>
>
>>
>> >
>> >  	cc = nvme_read4(sc, NVME_CC);
>> >  	CLR(cc, NVME_CC_SHN_MASK);
>> > @@ -1574,7 +1593,7 @@ nvme_intr(void *xsc)
>> >  	struct nvme_softc *sc = xsc;
>> >  	int rv = 0;
>> >
>> > -	if (nvme_q_complete(sc, sc->sc_q))
>> > +	if (sc->sc_q != NULL && nvme_q_complete(sc, sc->sc_q))
>> >  		rv = 1;
>> >  	if (nvme_q_complete(sc, sc->sc_admin_q))
>> >  		rv = 1;
>> >
>> >
>> >