Index | Thread | Search

From:
Martin Pieuchot <mpi@grenadille.net>
Subject:
Re: increase HOTPLUG_MAXEVENTS ? or handle differently upd(4) devices with many 'unclaimed' uhidev ?
To:
Landry Breuil <landry@openbsd.org>
Cc:
tech@openbsd.org
Date:
Mon, 9 Dec 2024 18:07:46 +0100

Download raw body.

Thread
On 09/12/24(Mon) 17:58, Landry Breuil wrote:
> Le Mon, Dec 09, 2024 at 11:21:01AM +0100, Martin Pieuchot a écrit :
> > On 07/12/24(Sat) 11:01, Landry Breuil wrote:
> > > hi,
> > > 
> > > so i have this EATON PRO 650 i'm tinkering with on my home gateway. On
> > > 7.6, if booting with the UPS plugged in, the port is disabled at boot:
> > > 
> > > /bsd: uhub0: device problem, disabling port 6
> > 
> > This is a known issue in our USB stack.  I don't know exactly what it
> > is.
> 
> looking deeper with USB_DEBUG set, at boot i get this:

Lovely.

> efifb at mainbus0 not configured
> usb_needs_explore: usb0: not exploring before first explore
> usb_needs_explore: usb0: not exploring before first explore
> usb_needs_explore: usb0: not exploring before first explore
> usb_needs_explore: usb0: not exploring before first explore
> usb_xfer_abort_thread: start
> usb_task_thread: start
> usbd_reset_port: port 6 reset done
> usb_needs_explore: usb0: not exploring before first explore
> usbd_new_device: bus=0xffff8000003e2000 port=6 depth=1 speed=2
> usbd_setup_pipe: dev=0xffff800000a05100 iface=0x0 ep=0xffff800000a05138 pipe=0xffff800000a05108
> usb_needs_explore: usb0: not exploring before first explore
> usb_free_device: 0xffff800000a05100
> uhub0: device problem, disabling port 6
> 
> my understanding is that usbd_setup_pipe in
> https://github.com/openbsd/src/blob/master/sys/dev/usb/usb_subr.c#L1091
> behaves fine (does it?), but i dont get to the debug printf in
> https://github.com/openbsd/src/blob/master/sys/dev/usb/usb_subr.c#L1144
> and between those, i only see usbd_reset_port() & usbd_get_desc() calls,
> with either USB_MAX_IPACKET or USB_DEVICE_DESCRIPTOR_SIZE, and more
> delays added between each.
> 
> i'll add more debug printfs to figure out which one does the 'goto
> fail', and if i can get more delay that would be enough for the device
> to properly attach/pass this step.

Yes, that's the way to do it.  I could never reproduce this bug reliably.
I'm very glad you can dig into this.

> > > more sensors and fix the broken RunTimeToEmpty one:
> > > - if i plug the UPS before BIOS POST, it refuses to boot (eg stays on blank
> > >   screen, not showing the lenovo POST screen)
> > > - if i plug the UPS after BIOS POST, during boot it detaches all uhid devices
> > >   because of a 'hotplug queue full', then reattaches them, and from that point
> > 
> > I doubt it is *because of* a 'hotplug queue full'.  This just indicates
> > that an hotplug message has been lost.
> > 
> > Is it when you see the "uhub0 device problem"?
> 
> i havent seen the hotplug queue full/detach/attach dance on the machine
> where i see the 'port disable' issue (my gateway, running 7.6), that's
> only on my t495s running current. havent tested other hosts (yet).
> 
> > > uhid0 at uhidev0 reportid 2: input=4, output=0, feature=4
> > > ...
> > > uhid29 at uhidev0 reportid 255: input=0, output=0, feature=63
> > > upd0: read 4 bytes for RunTimeToEmpty, got 524900, adjust=3906250, sensor val=2050390625000
> > > 
> > > bumping HOTPLUG_MAXEVENTS makes the UPS attach directly at boot on my laptop,
> > > but .. i'm not sure this is "right".
> > 
> > Which could indicate there is a timing issue involved.  Bumping this
> > value changes the behavior of hotplug_device_attach() and make it
> > execute more non-trivial code.
> 
> i would have rather thought that 'all those 29 uhid' triggered too many
> events (2 by uhid) that would hit the 64 limit, which would trigger the
> overflow, and the detach/reattach.

I understand but the error is not fatal.

> > Does the detach always happen at the same moment?  Could you investigate
> > where is there some delay missing?
> 
> i can try figuring out where the detach happens, but i would have
> thought that's a device-initiated detach (eg port reset from the device
> ?).  otherwise, what codepaths in the kernel could forcedly trigger/call
> a detach event ?
> 
> > > and i'm not sure either this will fix the 'disabling port 6' issue i'm seeing
> > > on my gateway.
> > 
> > They might be related.
> > 
> > > Thoughts ? diff below.
> > 
> > Bumping this might make sense if machines with more than 64 devices are
> > common.  However I'd argue that in the case of your upd(4), which
> > accounts for 32 devices, it would be better to support more deviceIDs
> > (sensors) and attach fewer uhid(4).
> 
> i'll have to recheck, but when i tried adding more random sensors from
> the various reportids in the initial diff at
> https://marc.info/?l=openbsd-tech&m=173175657305536&w=2, it didnt
> prevent the corresponding uhid(4) devices to attach.
> 
> my understanding is that upd(4) doesnt 'exclusively' claims them, so
> they're still available/attach as uhid, whether or not upd(4) "does
> something with the corresponding sensors".

In this case we should bump the value, ok with me.