Download raw body.
nvme: avoid use-after-free after shutdown
Jonathan Matthew <jonathan@d14n.org> writes:
> On Wed, Feb 11, 2026 at 12:18:24PM -0500, Dave Voutila wrote:
>> Jonathan Matthew <jonathan@d14n.org> writes:
>>
>> > A recent report on bugs@ showed the nvme interrupt handler crashing
>> > during reboot. It's a bit weird that the nvme controller raises an
>> > interrupt after we shut it down, but we shouldn't crash if that
>> > happens.
>> >
>> > The problem here is that we delete the IO queue and free the memory
>> > used for it, but the interrupt handler unconditionally tries to process
>> > it. This diff just skips the IO queue if we've deleted it. The
>> > bug reporter has confirmed that this fixes it for them.
>> >
>> > ok?
>> >
>> >
>> > Index: nvme.c
>> > ===================================================================
>> > RCS file: /cvs/src/sys/dev/ic/nvme.c,v
>> > diff -u -p -r1.126 nvme.c
>> > --- nvme.c 14 Jan 2026 01:07:57 -0000 1.126
>> > +++ nvme.c 4 Feb 2026 04:45:59 -0000
>> > @@ -557,6 +574,7 @@ nvme_shutdown(struct nvme_softc *sc)
>> > printf("%s: unable to delete q, disabling\n", DEVNAME(sc));
>> > goto disable;
>> > }
>> > + sc->sc_q = NULL;
>>
>> I think this part can still race with a rogue interrupt because sc_q
>> will be non-NULL for some period of time after nvme_q_delete(), no?
>
> The nvme interrupt handler is not marked MPSAFE (yet), so this is all
> protected by the kernel lock.
>
Do we actually hold the kernel lock in nvme_shutdown()? Since it's
called via nvme_activate(), there are at least two areas where
config_suspend_all(DVACT_POWERDOWN) is called on amd64 (via boot() and
sleep_state()) and I honestly don't know if the callers there hold the
kernel lock.
I know acpi uses a taskq to asynchronously do suspend/hibernate
invocations of sleep_state() that could trigger this code path.
If sc_q is assumed protected by it, probably worth either an assert (if
we believe we're holding it) and annotating the nvme_softc for that
member.
> We should also do a better job of turning the interrupt off, but that's
> another diff or two.
>
>
>>
>> >
>> > cc = nvme_read4(sc, NVME_CC);
>> > CLR(cc, NVME_CC_SHN_MASK);
>> > @@ -1574,7 +1593,7 @@ nvme_intr(void *xsc)
>> > struct nvme_softc *sc = xsc;
>> > int rv = 0;
>> >
>> > - if (nvme_q_complete(sc, sc->sc_q))
>> > + if (sc->sc_q != NULL && nvme_q_complete(sc, sc->sc_q))
>> > rv = 1;
>> > if (nvme_q_complete(sc, sc->sc_admin_q))
>> > rv = 1;
>> >
>> >
>> >
nvme: avoid use-after-free after shutdown