Download raw body.
sys/uvideo: avoid using queue.h
On 19/03/25(Wed) 23:27, Kirill A. Korinsky wrote:
> On Wed, 19 Mar 2025 22:47:21 +0100,
> Marcus Glocker <marcus@nazgul.ch> wrote:
> >
> > On Wed, Mar 19, 2025 at 09:04:34PM GMT, Marcus Glocker wrote:
> >
> > > On Wed, Mar 19, 2025 at 11:35:09AM GMT, Martin Pieuchot wrote:
> > >
> > > > There are two issues here. The first one is that uvideo_detach()
> > > > doesn't "close" the USB endpoints which should abort the transfers. I'd
> > > > suggest uvideo_detach() calls uvideo_close().
> > > >
> > > > The second issue is that uvideo_close() doesn't close the pipe due to
> > > > the wrong isdying() check. This is what aborts transfers and stop the
> > > > callbacks.
> > > >
> > > > Here's an untested diff that illustrates all of this. Note that the
> > > > kernel might now call uvideo_close() twice and some pointers might have
> > > > to be cleaned.
> > > >
> > > > The "dying" check should be avoided as much as possible. I don't think
> > > > it is necessary in the callback if the pipe is aborted during the close
> > > > operation. Btw usbd_close_pipe() implies usbd_abort_pipe().
> > >
> > > Thanks for the diff!
> > >
> > > I don't get an uvm fault panic anymore, but I still can manage to
> > > freeze the system when using ffplay and detaching the device during
> > > that in X. I also tried to remove both usbd_delay_ms() for a test,
> > > but makes no difference. I currently can't see what is happening here.
> > >
> > > We anyway could remove the usbd_delay_ms() in detach, since the only
> > > purpose was to get around this issue by naively waiting for
> > > uvideo_vs_cb() to complete, which doesn't work out very well we know
> > > now ;-)
> > >
> > > Needs some more deep dive it looks like ...
> >
> > I was playing a little bit puzzle with both of your diffs to understand
> > which part fixes what.
> >
> > With the attached diff I can't provoke an uvm fault panic, nor a freeze
> > anymore. While mpi's@ diff was taking care about the uvm fault panic,
> > kirill's@ diff was fixing the freeze.
> >
> > What did I change:
> >
> > mpi@ diff [1]:
> >
> > - I removed the usbd_delay_ms() from uvideo_detach(), since now it's
> > simple not required anymore.
> > - I removed the usbd_is_dying() comment from uvideo_vs_cb(), since it
> > seems to be enough to check for the usb xfer status.
> >
> > kirill@ diff [2]:
> >
> > - I only took over the videoioctl() peace to understand where ffplay
> > went crazy. I'm not sure if ffplay is just keep looping over
> > ioctl(). device_lookup() seems to be useful, since it does additional
> > checks to find out if the device is still active, like DVF_ACTIVE, as
> > already stated by kirill@ in another mail. I'm not sure about
> > device_unref(), but audio(4) uses it as well.
> >
> > In case testing for others would be successful too, I think we should
> > commit the fixes in separate parts. The uvideo(4) diff for fixing
> > the uvm fault panic, the video(4) diff to fix the freeze. I'm
> > potentially fine with kirill's@ full video(4) diff [2], but would like
> > to have mpi@'s feedback on that as well.
> >
> > [1] https://marc.info/?l=openbsd-tech&m=174238030006419&w=2
> > [2] https://marc.info/?l=openbsd-tech&m=174180759929893&w=2
> >
>
> I had tried your short diff and it works, but after multiple attach and
> detach of webcam (I haven't count, but it was something like few dozen
> times) the system simple crashed with stacktrace:
>
> panic: attempt to access user address 0x28 from EL1
> Stopped at panic+0x140: cmp w21, #0x0
> TID PID UID PRFLAGS PFLAGS CPU COMMAND
> *123765 9168 0 0x14000 0x200 1% usbtask
> db_enter() at panic+0x13c
> panic() at kdata_abort+0x180
> do_el0_sync() at handle_el1h_sync+0x68
> handle_el1h_sync() at uvideo_vs_close+0x20
> uvideo_vs_close() at uvideo_close+0x20
Which instruction is that? Which NULL pointer is being dereferenced?
The offset of 0x28 suggests that `sc_vs_cur' is NULL. Could it be that
uvideo_attach() didn't finish without error and that detaching such
device is busted?
> uvideo_close() at uvideo_detach+0x40
> uvideo_detach() at config_detach+0x160
> https://www.openbsd.org/ddb.html describes the minimum info required in bug
> reports. Insufficient info makes it difficult to find and fix bugs.
> ddb{1}>
That means the detach thread called uvideo_close() on an incomplete
descriptor.
Please make sure to execute "ps" when you see such backtrace it is helpful
to understand if another thread is currently sleeping inside uvideo(4) code
and where.
What did you do in your test? Was the /dev/video* being closed at the
same time you detached it? What it the kind of race that you were
triggering?
sys/uvideo: avoid using queue.h