Index | Thread | Search

From:
Martin Pieuchot <mpi@grenadille.net>
Subject:
Re: sys/uvideo: prevent crashing on parsing isco packeg when stream is closed
To:
tech@openbsd.org, mglocker@openbsd.org
Date:
Mon, 17 Mar 2025 10:49:25 +0100

Download raw body.

Thread
On 13/03/25(Thu) 19:01, Kirill A. Korinsky wrote:
> On Thu, 13 Mar 2025 10:13:58 +0100,
> Martin Pieuchot <mpi@grenadille.net> wrote:
> > 
> > On 13/03/25(Thu) 09:59, Kirill A. Korinsky wrote:
> > > tech@,
> > > 
> > > I had discovered one more crash which can be triggered by an application
> > > when it sends ioctl(VIDIOC_STREAMOFF) into streaming isco webcam, or
> > > when the device is detached.
> > > 
> > > To avoid such crash I'd like to add requirements that pipe is still
> > > open, before parse header, or schedule a new isoc transfer.
> > 
> > All these checks look like workarounds to me.  Could you describe what
> > is happening?  Would you mind sharing the panic trace?  What race is
> > this diff working around?
> > 
> 
> This is addresses this issue (text is OCRed but looks without too many typos):

What is the issue?  Double free?

>         ~ $ pkill ffplay
>         ~ $ video0 detached
>         uvideo0 detached
>         panic: uvm_fault failed: ffffff00003a9cd0 esr 96000004 far dead007fdead0f3?
>         Stopped at      panic+0x140:     cmp     w21, #0x0
>              TID     PID     UID   PRFLAGS    PFLAGS  CPU COMMAND
>           326064   47722      35      0x12         0    1 Xorg
>           522716   17327       0    0x14000     0x200    4 softclockhp
>         db_enter() at panic+0x13c
>         panic() at kdata_abort+0x170
>         kdata_abort() at handle_el1h_sync+0x68
>         handle_el1h_sync() at usbd_free_xfer+0x58
>         usbd_free_xfer() at uvideo_vs_free_isoc+0x48
>         uvideo_vs_free_isoc() at uvideo_close+0xc4
>         uvideo_close() at videoclose+0x64
> 
> I had run ffplay which stream from a webcam in X11, switched to console,
> deattached cam and tries to kill ffplay.
> 
> And system crash or frozen.

It indicates something is incorrect in the detach path.

> I do have a few similar workarounds which allows me to suceffuly deatach
> this webcam when it streams without crash or freeze system.
> 
> I had dig into wtf is going on, and seems that xhci controller starts to
> send really a lot of xfer packets when device is detached.
> 
> And when I had tried to kill ffplay... well, it leads to crashed.
> 
> It has a few places which I had, locally, workedaroud, and one of the
> notable example is https://marc.info/?l=openbsd-tech&m=174180759929893&w=2
> Without this, it calls video_stop after device is detached and it is blocked
> on mutex. I had logged an owner of the mutex a few times before it
> completley frozen, and it was 0x10000001e, 0x10000001d or similar value

This indicates that when a camera is detached other processes, like
ffplay(1), do not see it.  Instead it hangs in the kernel with a reference
to the detached driver.  Having a process/thread holding a reference to
an incomplete device descriptor is, IMHO, a bad design choice.  It will
force us to add workarounds in every code path were non-valid memory
could be dereferenced or I/O could be generated.

Could it be possible to make processes, like ffplay, abort their sleep
and return to userland as soon as a device is detached instead?

Could you please post the panic you've seen earlier with the mutex that
you prevented with the diff you mentioned above?  Instead of
refcounting the descriptor, which obviously leads to the current issues,
we might want to ensure all userland threads left the driver before
freeing its memory.  Do you think that's doable?