Download raw body.
sys/ihidev: prevent crash on interrupt storm
Vitaliy,
Thanks for review!
On Mon, 23 Dec 2024 12:39:19 +0100,
Vitaliy Makkoveev <otto@bsdbox.dev> wrote:
>
> > On 23 Dec 2024, at 14:23, Vitaliy Makkoveev <otto@bsdbox.dev> wrote:
> >
> >> On 23 Dec 2024, at 13:21, Kirill A. Korinsky <kirill@korins.ky> wrote:
> >>
> >> tech@,
> >>
> >> Before I move forward with quirks, I'd like to fix the cause of a crash in a
> >> system on HONOR MagicBook Art 14 Snapdragon.
> >>
> >> Let's assume that we have a device on a machine I2C bus which goes into a
> >> so-called interrupt storm for some reason. Under this condition, the system
> >> crashes if an interrupt arrives before we allocate a buffer to read it.
> >>
> >> Here's an example of a trace:
> >>
> >> panic: attempt to access user address 0x0 from EL1
> >> Stopped at panic+0x140: cmp w21, #0x0
> >>
> >> TID PID UID PRFLAGS PFLAGS CPU COMMAND
> >> db_enter() at panic+0x13c
> >> panic() at kdata_abort+0x180
> >> do_el0_sync() at handle_el1h_sync+0x68
> >> handle_el1h_sync() at qciic_exec+0x2d4
> >> qciic_exec() at ihidev_intr+0x70
> >> ihidev_intr() at qcgpio_intr+0xac
> >> qcgpio_intr() at agintc_irq_handler+0x2bc
> >>
> >> Tested on the same machine, without any additional patches. System boots,
> >> but touchpad doesn't work.
> >>
> >> Ok?
> >>
> >
> >
> > This is not related to your diff, but since you are working in this area,
> > please look at the following code.
> >
> > timeout_del_barrier(9) looks very suspicious to me. As I understand, the
> > following code could be executed from interrupt context, so barrier can’t
> > be used here. Sorry if I’m wrong and this is always thread context.
> >
>
> This is not the thread context, even the timer is interrupt context :)
>
Well, ihidev_intr migth be called from different places, at least from:
1. a device interrup
2. a timeout for the poll
3. ikbd_cngetc
I think moving place where iic_intr_establish is called is a wrong way. The
right way is confirm inside ihidev_intr that devices is fully attached, and
don't try to poll anything from not attached devices.
Also, you're right regarding timeout_del_barrier. I have replaced them to
just timeout_del because this code is called from different contexts.
So, resulted diff which boot on HONOR MagicBook Art 14 Snapdragon:
Index: sys/dev/i2c/ihidev.c
===================================================================
RCS file: /home/cvs/src/sys/dev/i2c/ihidev.c,v
diff -u -p -r1.33 ihidev.c
--- sys/dev/i2c/ihidev.c 18 Oct 2024 12:53:49 -0000 1.33
+++ sys/dev/i2c/ihidev.c 23 Dec 2024 15:54:16 -0000
@@ -108,6 +108,7 @@ ihidev_attach(struct device *parent, str
int repid, repsz;
int repsizes[256];
+ sc->sc_attached = 0;
sc->sc_tag = ia->ia_tag;
sc->sc_addr = ia->ia_addr;
sc->sc_hid_desc_addr = ia->ia_size;
@@ -195,6 +196,8 @@ ihidev_attach(struct device *parent, str
sc->sc_subdevs[repid] = (struct ihidev *)dev;
}
+ sc->sc_attached = 1;
+
if (sc->sc_refcnt > 0)
return;
@@ -241,7 +244,7 @@ ihidev_activate(struct device *self, int
if (sc->sc_poll && timeout_initialized(&sc->sc_timer)) {
DPRINTF(("%s: cancelling polling\n",
sc->sc_dev.dv_xname));
- timeout_del_barrier(&sc->sc_timer);
+ timeout_del(&sc->sc_timer);
}
if (ihidev_hid_command(sc, I2C_HID_CMD_SET_POWER,
&I2C_HID_POWER_OFF))
@@ -639,11 +642,14 @@ ihidev_intr(void *arg)
u_char *p;
u_int rep = 0;
+ if (!sc->sc_attached)
+ return (0);
+
if (sc->sc_poll && !sc->sc_frompoll) {
DPRINTF(("%s: received interrupt while polling, disabling "
"polling\n", sc->sc_dev.dv_xname));
sc->sc_poll = 0;
- timeout_del_barrier(&sc->sc_timer);
+ timeout_del(&sc->sc_timer);
}
/*
Index: sys/dev/i2c/ihidev.h
===================================================================
RCS file: /home/cvs/src/sys/dev/i2c/ihidev.h,v
diff -u -p -r1.9 ihidev.h
--- sys/dev/i2c/ihidev.h 3 Sep 2022 15:48:16 -0000 1.9
+++ sys/dev/i2c/ihidev.h 23 Dec 2024 15:54:23 -0000
@@ -86,6 +86,7 @@ struct ihidev_softc {
u_int sc_isize;
u_char *sc_ibuf;
+ int sc_attached;
int sc_refcnt;
int sc_poll;
--
wbr, Kirill
sys/ihidev: prevent crash on interrupt storm