xhci: recover halted endpoints on USB Transaction Errors
On 2026-04-19 18:37, Atanas Vladimirov wrote:
> Hi tech,
>
> On Supermicro X10/X11 boards (tested on X10SLL-F and X11) the emulated
> USB keyboard and mouse exposed by the BMC/iKVM stop working after a
> BMC reset until the host is rebooted.
>
> Reproducer: "Reset" button in the BMC web UI.
>
> When the device re-appears the HID's INTR IN endpoint answers every
> poll with a USB Transaction Error:
>
> xhci0: txerr? code 4 (with XHCI_DEBUG)
>
> Per xHCI r1.1 section 4.10.2.6 a Transaction Error completion leaves
> the endpoint in the Halted state. The current xhci_event_xfer_generic()
> just sets xfer->status = USBD_IOERROR and breaks, so every subsequent
> xfer queued on the pipe is silently dropped by the halted endpoint --
> the keyboard dies for good.
>
> The diff below does two things:
>
> 1) Treats XHCI_CODE_TXERR / XHCI_CODE_SPLITERR like XHCI_CODE_STALL
> and issues an async reset-ep, so the usb stack can restart the
> pipe on a clean endpoint.
>
> 2) Caps the number of consecutive TXERR-driven resets per pipe with
> a small counter in struct xhci_pipe (cleared on any successful or
> short completion). Once more than XHCI_TXERR_RETRIES consecutive
> errors have been seen the pipe is obviously wedged, so we complete
> the xfer with USBD_IOERROR and call usb_needs_reattach(); the hub
> explore task then detaches the stuck device, resets the port and
> re-enumerates it.
> On these boards the BMC has stabilised by then and the device
> comes back in its proper topology (ATEN hub with the HID behind
> it) and the keyboard works again without a host reboot.
>
> Please note that I used AI to understand the problem. I tested the patch
> on two machines and it works for me, but I understand that it might be
> totally wrong and that someone more capable than me might have a better
> approach.
>
> I'll be glad to provide more details or do some extra testing.
>
> Best wishes,
> Atanas
Hello,
Just a friendly reminder here :)
If you think that this approach is wrong, just let me know and I can file
a bug report with bugs@ including this context and ask for help there.
Best wishes,
Atanas
>
> Index: dev/usb/xhci.c
> ===================================================================
> --- dev/usb/xhci.c
> +++ dev/usb/xhci.c
> @@ -70,6 +70,7 @@ struct xhci_pipe {
> struct usbd_xfer *pending_xfers[XHCI_MAX_XFER];
> struct usbd_xfer *aborted_xfer;
> int halted;
> + unsigned int txerr_count;
> size_t free_trbs;
> int skip;
> #define TRB_PROCESSED_NO 0
> @@ -78,6 +79,8 @@ struct xhci_pipe {
> uint8_t trb_processed[XHCI_MAX_XFER];
> };
>
> +#define XHCI_TXERR_RETRIES 3
> +
> int xhci_reset(struct xhci_softc *);
> void xhci_suspend(struct xhci_softc *);
> int xhci_intr1(struct xhci_softc *);
> @@ -953,6 +956,7 @@ xhci_event_xfer_generic(struct xhci_softc *sc, struct
> usbd_xfer_isread(xfer) ?
> BUS_DMASYNC_POSTREAD : BUS_DMASYNC_POSTWRITE);
> xfer->status = USBD_NORMAL_COMPLETION;
> + xp->txerr_count = 0;
> break;
> case XHCI_CODE_SHORT_XFER:
> /*
> @@ -977,12 +981,31 @@ xhci_event_xfer_generic(struct xhci_softc *sc, struct
> usbd_xfer_isread(xfer) ?
> BUS_DMASYNC_POSTREAD : BUS_DMASYNC_POSTWRITE);
> xfer->status = USBD_NORMAL_COMPLETION;
> + xp->txerr_count = 0;
> break;
> case XHCI_CODE_TXERR:
> case XHCI_CODE_SPLITERR:
> DPRINTF(("%s: txerr? code %d\n", DEVNAME(sc), code));
> - xfer->status = USBD_IOERROR;
> - break;
> + /* Prevent any timeout to kick in. */
> + timeout_del(&xfer->timeout_handle);
> + usb_rem_task(xfer->device, &xfer->abort_task);
> +
> + /*
> + * A USB Transaction Error leaves the endpoint Halted
> + * (xHCI r1.1 4.10.2.6); reset it. If the endpoint
> + * keeps failing, ask the hub to re-enumerate the
> + * device rather than spinning forever.
> + */
> + if (++xp->txerr_count > XHCI_TXERR_RETRIES) {
> + xp->txerr_count = 0;
> + xfer->status = USBD_IOERROR;
> + usb_needs_reattach(xfer->device);
> + break;
> + }
> + xp->halted = USBD_IOERROR;
> + xp->aborted_xfer = xfer;
> + xhci_cmd_reset_ep_async(sc, slot, dci);
> + return (1);
> case XHCI_CODE_STALL:
> case XHCI_CODE_BABBLE:
> DPRINTF(("%s: babble code %d\n", DEVNAME(sc), code));
> @@ -1623,6 +1646,7 @@ xhci_pipe_init(struct xhci_softc *sc, struct usbd_pip
>
> xp->free_trbs = xp->ring.ntrb;
> xp->halted = 0;
> + xp->txerr_count = 0;
>
> sdev->pipes[xp->dci - 1] = xp;
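
As a reading aid, the retry cap described in point 2 can be modelled outside
the kernel. This is only a minimal user-space sketch, not driver code:
pipe_model, pipe_model_event and the two enums are hypothetical names made up
for illustration, and TXERR_RETRIES simply mirrors XHCI_TXERR_RETRIES from the
diff. Only the counter logic is meant to match the patch.

```c
#include <assert.h>
#include <stddef.h>

#define TXERR_RETRIES	3	/* mirrors XHCI_TXERR_RETRIES in the diff */

/* Hypothetical stand-ins for the completion codes and the handler's choices. */
enum completion { COMP_SUCCESS, COMP_TXERR };
enum action { ACT_COMPLETE, ACT_RESET_EP, ACT_REATTACH };

/* Hypothetical model of the one field the diff adds to struct xhci_pipe. */
struct pipe_model {
	unsigned int txerr_count;
};

/* Decide what the event handler would do for one transfer event. */
enum action
pipe_model_event(struct pipe_model *p, enum completion c)
{
	assert(p != NULL);

	if (c == COMP_SUCCESS) {
		/* Any successful (or short) completion clears the counter. */
		p->txerr_count = 0;
		return ACT_COMPLETE;
	}

	/* Transaction error: give up once the cap is exceeded. */
	if (++p->txerr_count > TXERR_RETRIES) {
		p->txerr_count = 0;
		return ACT_REATTACH;
	}

	/* Otherwise reset the halted endpoint and let the stack retry. */
	return ACT_RESET_EP;
}
```

With the cap at 3, three consecutive errors each lead to an endpoint reset
and the fourth one requests a reattach; a successful completion in between
starts the count over.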