[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Bug#808953: xhci regression for large transfers (commit e210c422b)



On Fri, Jan 08, 2016 at 04:10:09PM +0200, Mathias Nyman wrote:
> On 07.01.2016 20:47, Ron wrote:
> >On Fri, Jan 08, 2016 at 03:09:20AM +1030, Ron wrote:
> >>On Fri, Jan 08, 2016 at 02:52:28AM +1030, Ron wrote:
> >>>On Thu, Jan 07, 2016 at 05:38:09PM +0200, Mathias Nyman wrote:
> >>>>Hi
> >>>>
> >>>>On 02.01.2016 08:32, Ron wrote:
> >>>>>
> >>>>>Hi,
> >>>>>
> >>>>>It appears the commit e210c422b6fdd2dc123bedc588f399aefd8bf9de
> >>>>>"xhci: don't finish a TD if we get a short transfer event mid TD"
> >>>>>is causing transfers larger than 16kB to be unreliable.
> >>>>>
> >>>>>If I limit transfers to be no larger than 16kB, then it also works as
> >>>>>expected in an XHCI port with an unmodified build of Linus' current
> >>>>>head (v4.4-rc7-76-g9c982e8), but transfers larger than that do not.
> >>>>>I see an alternating cycle of a successful transfer, followed by two
> >>>>>that will time out waiting in libusb (with a 5 second timeout set),
> >>>>>before getting another successful transfer and the cycle repeating.
> >>>>>
> >>>>>I can run more tests and dig into this deeper if the reason for it
> >>>>>isn't immediately obvious in hindsight.
> >>>>>
> >>>>
> >>>>Thanks for the info,
> >>>>I can't spot anything obvious, but my brain might still be in vacation mode.
> >>>>
> >>>>Could you reproduce it with the attached patch, it only adds extra debugging?
> >>>>
> >>>>We should either see no output, or the following sequence:
> >>>>
> >>>>  1. "mid bulk/intr SP, wait for last TRB event"
> >>>>  2. "last trb has length set"
> >>>>  3. "and last trb is SHORT_TX, OK"
> >>>
> >>>
> >>>I guess one out of 3 ain't good ...  all I see logged is:
> >>>
> >>>  [   60.015708] xhci_hcd 0000:04:00.0: mid bulk/intr SP, wait for last TRB event
> >>>  [   65.017374] xhci_hcd 0000:04:00.0: mid bulk/intr SP, wait for last TRB event
> >>>  [   70.455451] xhci_hcd 0000:04:00.0: mid bulk/intr SP, wait for last TRB event
> >>>  [   75.456248] xhci_hcd 0000:04:00.0: mid bulk/intr SP, wait for last TRB event
> >>>
> >>>I'm passing 5 seconds to libusb as the requested timeout.
> >>
> >>And if I limit the maximum transfer size to 16kB (the above was with
> >>64kB transfers), then I see nothing logged at all.
> >>
> >>So if that code does indeed look sane, perhaps the issue is really
> >>in the code that's splitting large transfers doing something funny?
> >
> >So, tracing this a bit deeper, what I'm seeing is that in the case of
> >a transfer of 16kB or less, there's only one TRB in the TD, so it's
> >always the last_trb and the wait code never triggers (ie. basically
> >the same code path as with this patch reverted).
> >
> >In the case of a larger transfer, there are N / 16kB TRBs in the TD,
> >and in the case where it stalls, I get a short transfer on the first
> >TRB, and then it just hangs waiting until it times out, reporting:
> >
> >  xhci_hcd ep 0x81 - asked for 65536 bytes, 16382 bytes untransferred
> >
> >with event_trb == td->first_trb.
> >
> >
> >What's still not clear to me yet is which assumption is broken.
> >Clearly we aren't actually getting more TRB events after the short
> >transfer.
> >
> >And the other thing which possibly makes this odd is that the FTDI will
> >always return 2 bytes in a packet, even when there "no data" in it,
> >since it prefixes each packet with 2 "modem status" bytes.  Which means
> >the 16382 untransferred above is it returning 2 status bytes and none
> >of the data that we actually requested from it (which is a normal thing
> >for it to do sometimes, and if we issue another read the data will
> >eventually come - but I don't yet know if that's also contributing here).
> >
> >
> >Sorry for the trickle of small facts, I'm still wrapping my head around
> >everything I need to know to understand this properly.
> >
> 
> This is all good information.
> 
> We mark all TRBs for incoming bulk transfers as "interrupt on short transfer"
> and the last TRB of the TD as "interrupt on completion"
> 
> If the we get a short packet event mid TD xhc should create two events, the first
> one for the trb that was short, and a second one one the last trb, with the "interrupt on completion"

There is one thing we need to be careful of here, with the FTDI at least.
Once we get a short packet, we do actually need to end the transaction
there and not append any further data to it.  The reason for that is
those modem status bytes.  The only way to find them and filter them from
the actual data you want is to split the received transfer back into
wMaxPacketSize'd blocks and remove the first 2 bytes from each packet,
since it prepends them to every packet.

If a subsequent trb after the short was to add more data to that, we'd
lose that framing and not be able to extract the data from it reliably
(since we wouldn't know where the short in the middle ended).

I'm not sure if the events we expect in this case already cover that,
but it seemed worth mentioning in case they might return more data.

> There was an errata added to the specs specifying in more detail what this second events status should be,
> but the second event should be there anyways. controller will "fast forward" through the remains of the TD after
> a short packet, generating the event.

If I understand that part though, then we shouldn't actually get more
data, just a final completion event for the whole TD ...

> It's possible that something goes wrong in updating the dequeue pointer of either the endpoint ring or
> event ring, leading to handling the wrong event, or wrong trb for that endpoint (perhaps handling the same "mid TD" TRB twice?)
> 
> I have yet another patch. This one prints out the content of the short transfer event, and the event following a short transfer.

Ok, here's a longer dump of what that outputs, which I'll annotate to
show what's actually going on with the device at each point.

The extra values in parens of the "asked for" lines are
(td->last_trb - td->first_trb), and (event_trb - td->first_trb)
which was how I tracked which trb it was stalling on earlier.


First we do a bunch of control transfers to configure the device:

 [264.959659] @00000000d07c8d50 21f37760 00000001 0d0000fb 01018001 
 [264.959785] @00000000d07c8d60 21f37770 00000001 01000000 01018001 last td short
 [264.961182] @00000000d07c8d70 21f37790 00000001 0d0000e9 01018001 
 [264.961282] @00000000d07c8d80 21f377a0 00000001 01000000 01018001 last td short
 [264.961682] @00000000d07c8d90 21f377c0 00000001 0d0000fb 01018001 
 [264.961781] @00000000d07c8da0 21f377d0 00000001 01000000 01018001 last td short
 [264.963044] @00000000d07c8db0 21f377f0 00000001 0d0000eb 01018001 
 [264.963158] @00000000d07c8dc0 21f37800 00000001 01000000 01018001 last td short
 [264.963536] @00000000d07c8dd0 21f37820 00000001 0d0000fb 01018001 
 [264.963640] @00000000d07c8de0 21f37830 00000001 01000000 01018001 last td short
 [264.964661] @00000000d07c8df0 21f37850 00000001 0d0000f1 01018001 
 [264.964782] @00000000d07c8e00 21f37860 00000001 01000000 01018001 last td short

Then we do a bunch of small transfer requests, to ensure its internal buffers
are really empty, and that it's really in the mode we asked for and responding
as expected.  These all succeed (including returning nothing as expected) and
don't get stuck waiting.

 [264.968439] @00000000d07c8e30 d207c9b0 00000000 0d001ffe 01038001 last td short
 [264.968452] ep 0x81 - asked for 8192 bytes, 8190 bytes untransferred (0 0)
 [264.970774] @00000000d07c8e40 d207c9c0 00000000 0d001ffe 01038001 last td short
 [264.970786] ep 0x81 - asked for 8192 bytes, 8190 bytes untransferred (0 0)
 [264.973808] @00000000d07c8e50 d207c9d0 00000000 0d001ffe 01038001 last td short
 [264.973821] ep 0x81 - asked for 8192 bytes, 8190 bytes untransferred (0 0)
 [264.976809] @00000000d07c8e60 d207c9e0 00000000 0d001ffe 01038001 last td short
 [264.976822] ep 0x81 - asked for 8192 bytes, 8190 bytes untransferred (0 0)
 [264.979809] @00000000d07c8e70 d207c9f0 00000000 0d001ffe 01038001 last td short
 [264.979823] ep 0x81 - asked for 8192 bytes, 8190 bytes untransferred (0 0)
 [264.982784] @00000000d07c8e80 d207ca00 00000000 0d001ffe 01038001 last td short
 [264.982795] ep 0x81 - asked for 8192 bytes, 8190 bytes untransferred (0 0)
 [264.985809] @00000000d07c8e90 d207ca10 00000000 0d001ffe 01038001 last td short
 [264.985822] ep 0x81 - asked for 8192 bytes, 8190 bytes untransferred (0 0)
 [264.988809] @00000000d07c8ea0 d207ca20 00000000 0d001ffe 01038001 last td short
 [264.988824] ep 0x81 - asked for 8192 bytes, 8190 bytes untransferred (0 0)
 [264.991809] @00000000d07c8eb0 d207ca30 00000000 0d001ffe 01038001 last td short
 [264.991823] ep 0x81 - asked for 8192 bytes, 8190 bytes untransferred (0 0)
 [264.994789] @00000000d07c8ec0 d207ca40 00000000 0d001ffe 01038001 last td short
 [264.994803] ep 0x81 - asked for 8192 bytes, 8190 bytes untransferred (0 0)
 [265.047603] @00000000d07c8f50 d207ca50 00000000 0d0001fe 01038001 last td short
 [265.047616] ep 0x81 - asked for 512 bytes, 510 bytes untransferred (0 0)
 [265.047690] @00000000d07c8f60 d207ca60 00000000 0d0001fe 01038001 last td short
 [265.047713] ep 0x81 - asked for 512 bytes, 510 bytes untransferred (0 0)
 [265.047754] @00000000d07c8f70 d207ca70 00000000 0d0001fe 01038001 last td short
 [265.047764] ep 0x81 - asked for 512 bytes, 510 bytes untransferred (0 0)
 [265.049817] @00000000d07c8f80 d207ca80 00000000 0d0001fc 01038001 last td short
 [265.049830] ep 0x81 - asked for 512 bytes, 508 bytes untransferred (0 0)
 [265.050004] @00000000d07c8fa0 d207ca90 00000000 0d0001fc 01038001 last td short
 [265.050011] ep 0x81 - asked for 512 bytes, 508 bytes untransferred (0 0)
 [265.080289] @00000000d07c8fc0 d207caa0 00000000 0d001ffe 01038001 last td short
 [265.080302] ep 0x81 - asked for 8192 bytes, 8190 bytes untransferred (0 0)
 [265.080361] @00000000d07c8fd0 d207cab0 00000000 0d001ffe 01038001 last td short
 [265.080368] ep 0x81 - asked for 8192 bytes, 8190 bytes untransferred (0 0)
 [265.082788] @00000000d07c8fe0 d207cac0 00000000 0d001ffe 01038001 last td short
 [265.082802] ep 0x81 - asked for 8192 bytes, 8190 bytes untransferred (0 0)
 [265.085823] @00000000d07c8ff0 d207cad0 00000000 0d001ffe 01038001 last td short
 [265.085836] ep 0x81 - asked for 8192 bytes, 8190 bytes untransferred (0 0)
 [265.088813] @00000000d07c8000 d207cae0 00000000 0d001ffe 01038000 last td short
 [265.088821] ep 0x81 - asked for 8192 bytes, 8190 bytes untransferred (0 0)
 [265.091814] @00000000d07c8010 d207caf0 00000000 0d001ffe 01038000 last td short
 [265.091821] ep 0x81 - asked for 8192 bytes, 8190 bytes untransferred (0 0)
 [265.094781] @00000000d07c8020 d207cb00 00000000 0d001ffe 01038000 last td short
 [265.094788] ep 0x81 - asked for 8192 bytes, 8190 bytes untransferred (0 0)
 [265.097814] @00000000d07c8030 d207cb10 00000000 0d001ffe 01038000 last td short
 [265.097821] ep 0x81 - asked for 8192 bytes, 8190 bytes untransferred (0 0)
 [265.100823] @00000000d07c8040 d207cb20 00000000 0d001ffe 01038000 last td short
 [265.100837] ep 0x81 - asked for 8192 bytes, 8190 bytes untransferred (0 0)
 [265.103825] @00000000d07c8050 d207cb30 00000000 0d001ffe 01038000 last td short
 [265.103839] ep 0x81 - asked for 8192 bytes, 8190 bytes untransferred (0 0)


Then we try to start reading bulk data.  We want 64kB of 'real data' and we
cap the transfer request size at 64kB, which means because of the 2 byte
per packet overhead, we need at least 2 reads to get all the 'real data'.


So here, we request a 64kB transfer, and get all of it with no delay, then
request one more packet for the remainder, which we expect to be short and
contains the rest of the real data.

 [265.315125] @00000000d07c8070 d207cb70 00000000 01000000 01038000 last td short
 [265.315923] @00000000d07c8080 d207cb80 00000000 0d0000fe 01038000 
 [265.315937] ep 0x81 - asked for 512 bytes, 254 bytes untransferred (0 0)

The same request repeated, which also looks like it succeeds.

 [265.527263] @00000000d07c80a0 d207cbc0 00000000 01000000 01038000 last td short
 [265.528053] @00000000d07c80b0 d207cbd0 00000000 0d0000fe 01038000 
 [265.528067] ep 0x81 - asked for 512 bytes, 254 bytes untransferred (0 0)

And then the next request (which is otherwise the same on our side) blocks
waiting until it times out.

 [265.542997] @00000000d07c80d0 d207cbe0 00000000 0d003ffe 01038000 last td short
 [265.543010] ep 0x81 - asked for 65536 bytes, 16382 bytes untransferred (3 0)
 [265.543016] mid bulk/intr SP, wait for last TRB event 2

 [270.543961] @00000000d07c80f0 d207cc20 00000000 0d003ffe 01038000 last td short
 [270.543974] ep 0x81 - asked for 65536 bytes, 16382 bytes untransferred (3 0)
 [270.543980] mid bulk/intr SP, wait for last TRB event 2

After a couple of timeouts, we get another successful transfer, and this is
the cycle we see repeating.

 [275.752701] @00000000d07c8110 d207cc90 00000000 01000000 01038000 last td short
 [275.753484] @00000000d07c8120 d207cca0 00000000 0d0000fe 01038000 
 [275.753497] ep 0x81 - asked for 512 bytes, 254 bytes untransferred (0 0)

 [275.964817] @00000000d07c8140 d207cce0 00000000 01000000 01038000 last td short
 [275.965618] @00000000d07c8150 d207ccf0 00000000 0d0000fe 01038000 
 [275.965633] ep 0x81 - asked for 512 bytes, 254 bytes untransferred (0 0)

 [275.975846] @00000000d07c8170 d207cd00 00000000 0d003ffe 01038000 last td short
 [275.975852] ep 0x81 - asked for 65536 bytes, 16382 bytes untransferred (3 0)
 [275.975853] mid bulk/intr SP, wait for last TRB event 2

 [280.976790] @00000000d07c8190 d207cd40 00000000 0d003ffe 01038000 last td short
 [280.976804] ep 0x81 - asked for 65536 bytes, 16382 bytes untransferred (3 0)
 [280.976810] mid bulk/intr SP, wait for last TRB event 2

 [286.185537] @00000000d07c81b0 d207cdb0 00000000 01000000 01038000 last td short
 [286.186324] @00000000d07c81c0 d207cdc0 00000000 0d0000fe 01038000 
 [286.186337] ep 0x81 - asked for 512 bytes, 254 bytes untransferred (0 0)

 [286.397800] @00000000d07c81e0 d207ce00 00000000 01000000 01038000 last td short
 [286.398566] @00000000d07c81f0 d207ce10 00000000 0d0000fe 01038000 
 [286.398580] ep 0x81 - asked for 512 bytes, 254 bytes untransferred (0 0)

 [286.414933] @00000000d07c8210 d207ce20 00000000 0d003ffe 01038000 last td short
 [286.414938] ep 0x81 - asked for 65536 bytes, 16382 bytes untransferred (3 0)
 [286.414940] mid bulk/intr SP, wait for last TRB event 2

 [291.415870] @00000000d07c8230 d207ce60 00000000 0d003ffe 01038000 last td short
 [291.415883] ep 0x81 - asked for 65536 bytes, 16382 bytes untransferred (3 0)
 [291.415888] mid bulk/intr SP, wait for last TRB event 2

 [296.624642] @00000000d07c8250 d207ced0 00000000 01000000 01038000 last td short
 [296.625449] @00000000d07c8260 d207cee0 00000000 0d0000fe 01038000 
 [296.625463] ep 0x81 - asked for 512 bytes, 254 bytes untransferred (0 0)

 [296.836841] @00000000d07c8280 d207cf20 00000000 01000000 01038000 last td short
 [296.837636] @00000000d07c8290 d207cf30 00000000 0d0000fe 01038000 
 [296.837650] ep 0x81 - asked for 512 bytes, 254 bytes untransferred (0 0)

 [296.846742] @00000000d07c82b0 d207cf40 00000000 0d003ffe 01038000 last td short
 [296.846748] ep 0x81 - asked for 65536 bytes, 16382 bytes untransferred (3 0)
 [296.846749] mid bulk/intr SP, wait for last TRB event 2


  Ron


> From 2551cef961bdf1ca0e647b41bf4dc0d71e514315 Mon Sep 17 00:00:00 2001
> From: Mathias Nyman <mathias.nyman@linux.intel.com>
> Date: Fri, 8 Jan 2016 15:56:12 +0200
> Subject: [PATCH] xhci: FOR TESTING ONLY, display event trb for short packet
>  related events
> 
> display content of the event trb in case of short packets events, or
> in case the event for this endpoint follows a short packet event.
> 
> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
> ---
>  drivers/usb/host/xhci-ring.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
> index ed6ac7e..daabcfe 100644
> --- a/drivers/usb/host/xhci-ring.c
> +++ b/drivers/usb/host/xhci-ring.c
> @@ -2342,6 +2342,20 @@ static int handle_tx_event(struct xhci_hcd *xhci,
>  	event_dma = le64_to_cpu(event->buffer);
>  	trb_comp_code = GET_COMP_CODE(le32_to_cpu(event->transfer_len));
>  	/* Look for common error cases */
> +
> +	if (trb_comp_code == COMP_SHORT_TX ||
> +	    ep_ring->last_td_was_short) {
> +		xhci_warn(xhci, "@%016llx %08x %08x %08x %08x %s\n",
> +			  (unsigned long long) xhci_trb_virt_to_dma(
> +				  xhci->event_ring->deq_seg,
> +				  xhci->event_ring->dequeue),
> +			  lower_32_bits(le64_to_cpu(event->buffer)),
> +			  upper_32_bits(le64_to_cpu(event->buffer)),
> +			  le32_to_cpu(event->transfer_len),
> +			  le32_to_cpu(event->flags),
> +			  ep_ring->last_td_was_short ? "last td short" : "");
> +	}
> +
>  	switch (trb_comp_code) {
>  	/* Skip codes that require special handling depending on
>  	 * transfer type
> -- 
> 1.9.1


Reply to: