Bug#233831: www.debian.org: 404 error for Accept-Language: *
Package: www.debian.org
Version: 20040220
Severity: minor
www.debian.org appears to have a problem when an asterisk is present
in the Accept-Language header of an incoming HTTP request.
When I do a HEAD /support without an Accept-Language header all is well:
$ sed -e 's/^ //' -e 's/$/\r/' <<==HERE |
> HEAD /support HTTP/1.1
> Host: www.debian.org
> Connection: close
>
> ==HERE
> nc www.debian.org 80
HTTP/1.1 200 OK
Date: Fri, 20 Feb 2004 04:47:06 GMT
Server: Apache/1.3.26 (Unix) Debian GNU/Linux PHP/4.1.2 DAV/1.0.3
Content-Location: support.en.html
Vary: negotiate,accept-language
TCN: choice
Cache-Control: max-age=86400
Expires: Sat, 21 Feb 2004 04:47:06 GMT
Last-Modified: Tue, 17 Feb 2004 02:33:35 GMT
ETag: "11301e9-3b67-40317d7f;403575e1"
Accept-Ranges: bytes
Content-Length: 15207
Connection: close
Content-Type: text/html
Content-Language: en
But if I add Accept-Language: * I get a 404:
$ sed -e 's/^ //' -e 's/$/\r/' <<==HERE |
> HEAD /support HTTP/1.1
> Host: www.debian.org
> Accept-Language: *
> Connection: close
>
> ==HERE
> nc www.debian.org 80
HTTP/1.1 404 Not Found
Date: Fri, 20 Feb 2004 04:47:44 GMT
Server: Apache/1.3.26 (Unix) Debian GNU/Linux PHP/4.1.2 DAV/1.0.3
Content-Location: support.nb.html
Vary: negotiate,accept-language
TCN: choice
Connection: close
Content-Type: text/html; charset=iso-8859-1
Then again, if I put actual languages in the Accept-Language header,
all is well:
$ sed -e 's/^ //' -e 's/$/\r/' <<==HERE |
> HEAD /support HTTP/1.1
> Host: www.debian.org
> Accept-Language: sv-FI, i-navajo, en-US
> Connection: close
>
> ==HERE
> nc www.debian.org 80
HTTP/1.1 200 OK
Date: Fri, 20 Feb 2004 05:04:55 GMT
Server: Apache/1.3.26 (Unix) Debian GNU/Linux PHP/4.1.2 DAV/1.0.3
Content-Location: support.en-us.html
Vary: negotiate,accept-language
TCN: choice
Cache-Control: max-age=86400
Expires: Sat, 21 Feb 2004 05:04:55 GMT
Last-Modified: Tue, 17 Feb 2004 02:33:35 GMT
ETag: "11301e9-3b67-40317d7f;403575e1"
Accept-Ranges: bytes
Content-Length: 15207
Connection: close
Content-Type: text/html
Content-Language: en-us
If at this point I append ", *" to the Accept-Language header, the
request still succeeds.
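The requests in this report differ only in their Accept-Language
header, so they can be generated by a small helper. This is a minimal
sketch, not what was used to produce the transcripts here: the function
name build_head_request and its argument order are made up for
illustration. It writes the request with CRLF line endings directly
instead of post-processing a here-document with sed.

```shell
# Hypothetical helper (not part of the original transcripts).
# Usage: build_head_request HOST PATH [ACCEPT_LANGUAGE]
# Prints a HEAD request in which every line ends in CRLF.
build_head_request() {
    host=$1
    path=$2
    accept_language=${3-}   # optional; header is omitted when empty

    printf 'HEAD %s HTTP/1.1\r\n' "$path"
    printf 'Host: %s\r\n' "$host"
    if [ -n "$accept_language" ]; then
        printf 'Accept-Language: %s\r\n' "$accept_language"
    fi
    printf 'Connection: close\r\n'
    printf '\r\n'           # empty line terminates the header block
}

# Example (needs network access, so it is left commented out):
#   build_head_request www.debian.org /support '*' | nc www.debian.org 80
```

Quoting the asterisk in the example keeps the shell from expanding it
as a filename pattern.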
(However, when I mistakenly used sv_FI and en_US as language tags, the
asterisk did produce an error, whereas the same request without the
asterisk returned a 200 with what is apparently the Mother of All
Variants of this page, presumably English:
$ sed -e 's/^ //' -e 's/$/\r/' <<==HERE |
> HEAD /support HTTP/1.1
> Host: www.debian.org
> Accept-Language: sv_FI, i_navajo, en_US
> Connection: close
>
> ==HERE
> nc www.debian.org 80 |
> egrep '^(HTTP|Content-Location)'
HTTP/1.1 200 OK
Content-Location: support.html
If I put back in the * as a catch-all, there's the 404 again:
$ sed -e 's/^ //' -e 's/$/\r/' <<==HERE |
> HEAD /support HTTP/1.1
> Host: www.debian.org
> Accept-Language: sv_FI, i_navajo, en_US, *
> Connection: close
>
> ==HERE
> nc www.debian.org 80 |
> egrep '^(HTTP|Content-Location)'
HTTP/1.1 404 Not Found
Content-Location: support.nb.html
You'll notice that I'm taking the liberty of grepping just the
interesting parts of the response here to keep this parenthesis shorter.)
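The egrep filter used in the parenthesis above can be wrapped in a
function so that every test prints the same two lines. This is only a
sketch; the name summarize_response is invented here. It also strips
the carriage returns that end each header line on the wire, which the
plain egrep leaves in place.

```shell
# Hypothetical helper: read an HTTP response on stdin and print only
# the status line and the Content-Location header, without trailing CRs.
summarize_response() {
    tr -d '\r' | grep -E '^(HTTP|Content-Location)'
}

# Example (needs network access, so it is left commented out):
#   printf 'HEAD /support HTTP/1.1\r\nHost: www.debian.org\r\nAccept-Language: *\r\nConnection: close\r\n\r\n' |
#       nc www.debian.org 80 | summarize_response
```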
The language range "*" is explicitly allowed by RFC 2616 section 14.4,
where it matches any language not matched by another range in the
header. Granted, sending it on its own is perhaps dubious.
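For reference, the matching rule in RFC 2616 section 14.4 can be
sketched as follows. This is my reading of the RFC, not the algorithm
Apache actually runs: the function names range_matches and
accepts_language are invented for illustration, and q-values are
stripped and ignored, so "q=0" exclusions and preference ordering are
not modelled.

```shell
# range_matches RANGE TAG
# Succeeds if RANGE is "*", equals TAG, or is a prefix of TAG that is
# immediately followed by "-". Comparison is case-insensitive.
range_matches() {
    range=$(printf '%s' "$1" | tr 'A-Z' 'a-z')
    tag=$(printf '%s' "$2" | tr 'A-Z' 'a-z')
    case $tag in
        "$range"|"$range"-*) return 0 ;;   # quoted, so "*" is literal here
    esac
    [ "$range" = '*' ]                     # wildcard matches any tag
}

# accepts_language HEADER_VALUE TAG
# Succeeds if any language range in HEADER_VALUE matches TAG.
# Assumption: ";q=..." parameters are dropped rather than evaluated.
accepts_language() {
    printf '%s\n' "$1" | tr ',' '\n' | (
        while IFS= read -r item; do
            item=$(printf '%s' "${item%%;*}" | tr -d ' \t')
            if [ -n "$item" ] && range_matches "$item" "$2"; then
                exit 0
            fi
        done
        exit 1
    )
}
```

Under this reading, accepts_language '*' nb succeeds, so a server
holding a Norwegian variant has something acceptable to return. The
underscore tags from my mistaken requests match nothing on their own,
which would be consistent with the server falling back to
support.html.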
Real-world case: The W3C link validator reports broken links (404s)
to many of the important pages on www.debian.org. Here is a discussion:
<http://thread.gmane.org/gmane.org.w3c.validator/3520>
/* era */
-- System Information
Debian Release: 3.0
Kernel Version: Linux there.afraid.org 2.2.20 #1 SMP Thu Nov 7 16:15:53 EET 2002 i586 unknown