From: Kirill A. Korinsky Subject: Re: sys/ihidev: prevent crash on interrupt storm To: Vitaliy Makkoveev Cc: OpenBSD tech Date: Mon, 23 Dec 2024 16:58:39 +0100 Vitaliy, Thanks for review! On Mon, 23 Dec 2024 12:39:19 +0100, Vitaliy Makkoveev wrote: > > > On 23 Dec 2024, at 14:23, Vitaliy Makkoveev wrote: > > > >> On 23 Dec 2024, at 13:21, Kirill A. Korinsky wrote: > >> > >> tech@, > >> > >> Before I move forward with quirks, I'd like to fix the cause of a crash in a > >> system on HONOR MagicBook Art 14 Snapdragon. > >> > >> Let's assume that we have a device on a machine I2C bus which goes into a > >> so-called interrupt storm for some reason. Under this condition, the system > >> crashes if an interrupt arrives before we allocate a buffer to read it. > >> > >> Here's an example of a trace: > >> > >> panic: attempt to access user address 0x0 from EL1 > >> Stopped at panic+0x140: cmp w21, #0x0 > >> > >> TID PID UID PRFLAGS PFLAGS CPU COMMAND > >> db_enter() at panic+0x13c > >> panic() at kdata_abort+0x180 > >> do_el0_sync() at handle_el1h_sync+0x68 > >> handle_el1h_sync() at qciic_exec+0x2d4 > >> qciic_exec() at ihidev_intr+0x70 > >> ihidev_intr() at qcgpio_intr+0xac > >> qcgpio_intr() at agintc_irq_handler+0x2bc > >> > >> Tested on the same machine, without any additional patches. System boots, > >> but touchpad doesn't work. > >> > >> Ok? > >> > > > > > > This is not related to your diff, but since you are working in this area, > > please look at the following code. > > > > timeout_del_barrier(9) looks very suspicious to me. As I understand, the > > following code could be executed from interrupt context, so barrier can’t > > be used here. Sorry if I’m wrong and this is always thread context. > > > > This is not the thread context, even the timer is interrupt context :) > Well, ihidev_intr migth be called from different places, at least from: 1. a device interrup 2. a timeout for the poll 3. ikbd_cngetc I think moving place where iic_intr_establish is called is a wrong way. The right way is confirm inside ihidev_intr that devices is fully attached, and don't try to poll anything from not attached devices. Also, you're right regarding timeout_del_barrier. I have replaced them to just timeout_del because this code is called from different contexts. So, resulted diff which boot on HONOR MagicBook Art 14 Snapdragon: Index: sys/dev/i2c/ihidev.c =================================================================== RCS file: /home/cvs/src/sys/dev/i2c/ihidev.c,v diff -u -p -r1.33 ihidev.c --- sys/dev/i2c/ihidev.c 18 Oct 2024 12:53:49 -0000 1.33 +++ sys/dev/i2c/ihidev.c 23 Dec 2024 15:54:16 -0000 @@ -108,6 +108,7 @@ ihidev_attach(struct device *parent, str int repid, repsz; int repsizes[256]; + sc->sc_attached = 0; sc->sc_tag = ia->ia_tag; sc->sc_addr = ia->ia_addr; sc->sc_hid_desc_addr = ia->ia_size; @@ -195,6 +196,8 @@ ihidev_attach(struct device *parent, str sc->sc_subdevs[repid] = (struct ihidev *)dev; } + sc->sc_attached = 1; + if (sc->sc_refcnt > 0) return; @@ -241,7 +244,7 @@ ihidev_activate(struct device *self, int if (sc->sc_poll && timeout_initialized(&sc->sc_timer)) { DPRINTF(("%s: cancelling polling\n", sc->sc_dev.dv_xname)); - timeout_del_barrier(&sc->sc_timer); + timeout_del(&sc->sc_timer); } if (ihidev_hid_command(sc, I2C_HID_CMD_SET_POWER, &I2C_HID_POWER_OFF)) @@ -639,11 +642,14 @@ ihidev_intr(void *arg) u_char *p; u_int rep = 0; + if (!sc->sc_attached) + return (0); + if (sc->sc_poll && !sc->sc_frompoll) { DPRINTF(("%s: received interrupt while polling, disabling " "polling\n", sc->sc_dev.dv_xname)); sc->sc_poll = 0; - timeout_del_barrier(&sc->sc_timer); + timeout_del(&sc->sc_timer); } /* Index: sys/dev/i2c/ihidev.h =================================================================== RCS file: /home/cvs/src/sys/dev/i2c/ihidev.h,v diff -u -p -r1.9 ihidev.h --- sys/dev/i2c/ihidev.h 3 Sep 2022 15:48:16 -0000 1.9 +++ sys/dev/i2c/ihidev.h 23 Dec 2024 15:54:23 -0000 @@ -86,6 +86,7 @@ struct ihidev_softc { u_int sc_isize; u_char *sc_ibuf; + int sc_attached; int sc_refcnt; int sc_poll; -- wbr, Kirill