Bug#233831: marked as done (www.debian.org: 404 error for Accept-Language: * )



Your message dated Fri, 29 Sep 2006 22:52:18 +0200
with message-id <20060929205218.GH16617@gadget.maisel.enst-bretagne.fr>
and subject line no more 404 with Accept-Language: *
has caused the attached Bug report to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what I am
talking about this indicates a serious mail system misconfiguration
somewhere.  Please contact me immediately.)

Debian bug tracking system administrator
(administrator, Debian Bugs database)

--- Begin Message ---
Package: www.debian.org
Version: 20040220
Severity: minor

www.debian.org appears to have a problem when an asterisk is present
in the Accept-Language header of an incoming HTTP request.

When I do a HEAD /support without an Accept-Language header all is well:

 $ sed -e 's/^  //' -e 's/$/\r/' <<==HERE |
 >   HEAD /support HTTP/1.1
 >   Host: www.debian.org
 >   Connection: close
 > 
 > ==HERE
 > nc www.debian.org 80
 HTTP/1.1 200 OK
 Date: Fri, 20 Feb 2004 04:47:06 GMT
 Server: Apache/1.3.26 (Unix) Debian GNU/Linux PHP/4.1.2 DAV/1.0.3
 Content-Location: support.en.html
 Vary: negotiate,accept-language
 TCN: choice
 Cache-Control: max-age=86400
 Expires: Sat, 21 Feb 2004 04:47:06 GMT
 Last-Modified: Tue, 17 Feb 2004 02:33:35 GMT
 ETag: "11301e9-3b67-40317d7f;403575e1"
 Accept-Ranges: bytes
 Content-Length: 15207
 Connection: close
 Content-Type: text/html
 Content-Language: en

But if I add Accept-Language: * I get a 404:

 $ sed -e 's/^  //' -e 's/$/\r/' <<==HERE |
 >   HEAD /support HTTP/1.1
 >   Host: www.debian.org
 >   Accept-Language: *
 >   Connection: close
 > 
 > ==HERE
 > nc www.debian.org 80
 HTTP/1.1 404 Not Found
 Date: Fri, 20 Feb 2004 04:47:44 GMT
 Server: Apache/1.3.26 (Unix) Debian GNU/Linux PHP/4.1.2 DAV/1.0.3
 Content-Location: support.nb.html
 Vary: negotiate,accept-language
 TCN: choice
 Connection: close
 Content-Type: text/html; charset=iso-8859-1

Then again if I add actual languages to the Accept-Language header all
is well again:

 $ sed -e 's/^  //' -e 's/$/\r/' <<==HERE |
 >   HEAD /support HTTP/1.1
 >   Host: www.debian.org
 >   Accept-Language: sv-FI, i-navajo, en-US
 >   Connection: close
 > 
 > ==HERE
 > nc www.debian.org 80
 HTTP/1.1 200 OK
 Date: Fri, 20 Feb 2004 05:04:55 GMT
 Server: Apache/1.3.26 (Unix) Debian GNU/Linux PHP/4.1.2 DAV/1.0.3
 Content-Location: support.en-us.html
 Vary: negotiate,accept-language
 TCN: choice
 Cache-Control: max-age=86400
 Expires: Sat, 21 Feb 2004 05:04:55 GMT
 Last-Modified: Tue, 17 Feb 2004 02:33:35 GMT
 ETag: "11301e9-3b67-40317d7f;403575e1"
 Accept-Ranges: bytes
 Content-Length: 15207
 Connection: close
 Content-Type: text/html
 Content-Language: en-us

If at this point I add ", *" to the Accept-Language header, it will
not break things.

(However, when I mistakenly used sv_FI and en_US as language tags, the
asterisk produced an error, whereas if I sent the same request without
the asterisk, I would get a 200 with what is apparently the Mother of
All Variants of this page, presumably English:

 $ sed -e 's/^  //' -e 's/$/\r/' <<==HERE |
 >   HEAD /support HTTP/1.1
 >   Host: www.debian.org
 >   Accept-Language: sv_FI, i_navajo, en_US
 >   Connection: close
 > 
 > ==HERE
 > nc www.debian.org 80  |
 > egrep '^(HTTP|Content-Location)'
 HTTP/1.1 200 OK
 Content-Location: support.html

If I put back in the * as a catch-all, there's the 404 again:

 $ sed -e 's/^  //' -e 's/$/\r/' <<==HERE |
 >   HEAD /support HTTP/1.1
 >   Host: www.debian.org
 >   Accept-Language: sv_FI, i_navajo, en_US, *
 >   Connection: close
 > 
 > ==HERE
 > nc www.debian.org 80  |
 > egrep '^(HTTP|Content-Location)'
 HTTP/1.1 404 Not Found
 Content-Location: support.nb.html

You'll notice that I'm taking the liberty of grepping out just the
interesting parts of the response here, to keep this parenthesis shorter.)

The language range "*" is explicitly allowed in RFC 2616 section 14.4,
where it matches every language not matched by another range in the
header. Granted, passing it in on its own is perhaps dubious.
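
To make the expected behaviour concrete, here is a minimal sketch of how
"*" could be treated when picking a variant. It is an illustration only,
not Apache's negotiation algorithm: the function name pick_variant is
hypothetical, q-values and prefix matching are ignored, and the variant
names are assumed to be lowercase.

```shell
# Hypothetical helper, not Apache's algorithm: print the first variant that an
# Accept-Language value accepts. Ignores q-values and prefix matching.
# Usage: pick_variant 'sv-FI, en-US, *' nb en-us
pick_variant() {
    header=$1
    shift
    wildcard=no
    # One lowercase language range per line, without ";q=..." or spaces.
    ranges=$(printf '%s\n' "$header" | tr ',' '\n' |
        sed -e 's/;.*//' -e 's/[[:space:]]//g' | tr '[:upper:]' '[:lower:]')
    while IFS= read -r range; do
        if [ "$range" = "*" ]; then
            wildcard=yes
            continue
        fi
        for variant in "$@"; do
            if [ "$variant" = "$range" ]; then
                printf '%s\n' "$variant"
                return 0
            fi
        done
    done <<EOF
$ranges
EOF
    # "*" stands for every language not listed, so any variant will do.
    if [ "$wildcard" = yes ] && [ $# -gt 0 ]; then
        printf '%s\n' "$1"
        return 0
    fi
    return 1
}
```

With this sketch, a header of just "*" always selects some existing
variant, which is why a 404 for that header looks wrong.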

Real-world case: The W3C link validator reports broken links (404s)
to many of the important pages on www.debian.org. Here is a discussion:
<http://thread.gmane.org/gmane.org.w3c.validator/3520>

/* era */

-- System Information
Debian Release: 3.0
Kernel Version: Linux there.afraid.org 2.2.20 #1 SMP Thu Nov 7 16:15:53 EET 2002 i586 unknown


--- End Message ---
--- Begin Message ---
Hello,

The Debian web servers no longer have a problem with Accept-Language: *.

Both of these commands now return '200 OK':
wget -S --header='Accept-Language: *' http://www.debian.org/support
wget -S --header='Accept-Language: fr, *' http://www.debian.org/support
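
To compare just the status codes, the header dump can be filtered. A
minimal sketch, assuming the headers arrive on stderr as wget -S prints
them; the helper name status_of is hypothetical:

```shell
# Hypothetical helper: print the status code of the last HTTP status line
# read from stdin (the last one, so that redirects are followed through).
status_of() {
    awk '/^ *HTTP\// { code = $2 } END { if (code != "") print code }'
}
```

For example:
wget -S -O /dev/null --header='Accept-Language: *' http://www.debian.org/support 2>&1 | status_of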

Regards.

-- 
Simon Paillard

--- End Message ---
