Bug#806959: apt-get decodes the Location header when redirecting, consequent GET request has an invalid URL
On Tue, Mar 22, 2016 at 01:09:30AM +0200, Yoav Landman wrote:
> > >
> > > according to the spec (linked in the above SO thread) the Location
> > > header should be sent URL encoded, hence decoding it is wrong..
> >
> > We dequoted the URI because the URI is quoted again afterwards. Not all
> > characters are quoted, though, but at least a %3B would be quoted as
> > %253B.
> >
> > See commit c34ea12ad509cb34c954ed574a301c3cbede55ec and Bug#602412 for
> > details.
>
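To illustrate the double-quoting described above, here is a minimal Python
sketch. It uses urllib.parse rather than APT's own quoting code, and the
example.org URL and the set of "safe" characters are assumptions made purely
for illustration; APT's real quoting rules may differ.

```python
from urllib.parse import quote, unquote

# Hypothetical Location header value, already percent-encoded by the server.
location = "http://example.org/pool/foo%3Bbar.deb"

# Quoting an already-encoded URI escapes the "%" itself: %3B becomes %253B.
double_quoted = quote(location, safe=":/")

# Dequoting first avoids the double escape, but the result then depends on
# which characters the quoter chooses to escape. Here ";" is assumed to be
# left alone, so the request no longer matches what the server sent.
requoted = quote(unquote(location), safe=":/;")

print(double_quoted)  # http://example.org/pool/foo%253Bbar.deb
print(requoted)       # http://example.org/pool/foo;bar.deb
```

Neither result equals the original Location value, which is the mismatch this
bug report is about.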
> Hi,
>
> While it is understandable why the current logic makes this fix less
> trivial, I am concerned that the basic working assumptions behind this
> logic are broken.
They most certainly are.
>
> First, without understanding the URL parts before decoding, re-encoding
> (what is referred here as "quoting") back to the original form cannot be
> done in a symmetric way to give back the same semantics.
> Please see the following link for more details:
> https://www.talisman.org/~erlkonig/misc/lunatech%5Ewhat-every-webdev-must-know-about-url-encoding/#DecodedURLscannotbereencodedtothesameform
>
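The asymmetry in the first point can be shown with a short Python sketch. The
URL is made up for illustration, and urllib.parse stands in for whatever
decoder and encoder a client might use.

```python
from urllib.parse import quote, unquote

# Hypothetical URL whose first path segment contains an encoded slash,
# i.e. two segments: "a/b" and "c".
original = "http://example.org/a%2Fb/c"

# Decoding erases the difference between the encoded slash and the real
# path separators, so re-encoding cannot restore it.
decoded = unquote(original)
reencoded = quote(decoded, safe=":/")

print(reencoded)  # http://example.org/a/b/c
```

The re-encoded URL now has three path segments ("a", "b", "c") instead of
two, so it may refer to a different resource than the one the server named.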
> Second, it is the HTTP 1.1 spec that mandates the Location header's URL to
> be encoded, so it is a reasonable expectation from the server standpoint
> for clients to treat the Location value as a legally encoded URL.
> The fix mentioned above is breaking this assumption by re-applying encoding
> in a way that is changing the original URL semantics.
True. It's also true that some servers send you relative location headers,
though, so it's not like everyone else is doing things the right way...
>
> >
> > I'd advise you to not use URIs involving percent-encoded characters or
> > provide a patch and a test case if you want to see this (actually minor)
> > issue fixed.
> >
> > As a data point, all the official mirrors and redirectors work fine.
>
> Surely, one cannot avoid using legitimate encoded URLs as a solution.
> With hundreds of User-Agent types, currently apt is the only agent that
> does not follow redirects as advised by the spec and we had to implement a
> server side workaround for it (basically not serving redirects).
> So, I'd urge the apt library to do the right thing and treat the Location
> header as an encoded URL ready to be used as is without further
> manipulation.
>
I'm not sure it is easily possible. Most of the URIs we deal with are
unencoded and we have to encode them as servers mess up otherwise. The
redirect case is a bit special here, as we wouldn't have to do that.
Support for redirects is fairly new too, it's only been in APT for about
7 years, and only became really usable less than 4 years ago.
I also don't have a test case that actually fails. So this might take
some time. I have an idea how to fix this: Set a flag where we store
the URI, but it's a bit complicated as the methods run in separate
processes and the data needs to be exchanged via text messages.
I also need to create a failing test case first, I'm not sure if that's
possible with what we have currently. Without a test case, I wouldn't
know if it fixes the issue...
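As a rough sketch of the flag idea, written in Python rather than APT's C++:
the message layout, the "Already-Encoded" field name and the helper functions
below are all hypothetical and are not APT's actual method protocol. They only
illustrate carrying an "already encoded" marker alongside the URI through a
text message.

```python
from urllib.parse import quote


def to_message(uri: str, already_encoded: bool) -> str:
    """Serialize a URI plus its flag as a simple text message (hypothetical)."""
    flag = "yes" if already_encoded else "no"
    return f"URI: {uri}\nAlready-Encoded: {flag}\n"


def from_message(text: str) -> tuple[str, bool]:
    """Parse the hypothetical message back into (uri, already_encoded)."""
    fields = dict(line.split(": ", 1) for line in text.splitlines() if line)
    return fields["URI"], fields["Already-Encoded"] == "yes"


def request_uri(uri: str, already_encoded: bool) -> str:
    """Quote only URIs that did not arrive encoded, e.g. not from a redirect."""
    return uri if already_encoded else quote(uri, safe=":/")


# A redirect target is passed through untouched...
uri, flag = from_message(to_message("http://example.org/foo%3Bbar.deb", True))
print(request_uri(uri, flag))

# ...while an unencoded URI from elsewhere is still quoted as before.
print(request_uri("http://example.org/a b", False))
```

The point of the design is that the decision to quote travels with the URI,
so the process issuing the request does not have to guess where it came from.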
Let me have a go at this tomorrow (well, technically today in our
time zones), I don't have anything else planned.
--
Debian Developer - deb.li/jak | jak-linux.org - free software dev
When replying, only quote what is necessary, and write each reply
directly below the part(s) it pertains to (`inline'). Thank you.