
Bug#668858: apache2: doesn't use UTF-8 by default



On Sunday 15 April 2012, Adam Borowski wrote:
> Apache is one of the few things in Debian not configured to use
> UTF-8 by default.  Considering that UTF-8 has been the default
> encoding for four releases already, GUI stuff doesn't really
> support ancient locales anymore and there's talk about dropping
> them from glibc as well, changing this in Apache is long overdue. 
> The transition to 2.4 seems like a good time for such a change.

First of all, there are many places in apache2 where the charset can
play a role. It already uses UTF-8 for some things, e.g. file names in
auto-generated directory listings. Please specify exactly what it
should do to "use UTF-8".

> 
> It's a quite visible sore spot too: like half of even Debian's
> servers get the encoding wrong on text/plain files, mangling
> people's names, etc -- so even if the Debian crowd has trouble
> getting this right, what can be said about an average admin?
> 
> If every single file on the system is in UTF-8, I see not a single
> reason to use something else.

So, you think it should always send charset=UTF-8 in the Content-Type
header? This is very problematic because the header overrides any
encoding that is declared inside *.html or *.xml files. And the claim
that every single file on a Debian system is UTF-8 is simply not true:

$ find /usr/share/ -name \*html -type f|xargs grep -il iso.8859|wc -l
1721
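
To illustrate why forcing the header is a problem, here is a rough
sketch. The file and its contents are made up for the example, and it
assumes iconv(1) as shipped with glibc. A page that declares ISO-8859-1
in its markup and contains a single non-ASCII character is not valid
UTF-8 at all:

```shell
# Hypothetical test page: declares ISO-8859-1 and contains "café",
# with the é stored as the single Latin-1 byte 0xE9 (octal 351).
tmp=$(mktemp)
printf '<meta charset="iso-8859-1"><p>caf\351</p>\n' > "$tmp"

# iconv exits non-zero when the input is not well-formed UTF-8.
if iconv -f UTF-8 -t UTF-8 "$tmp" >/dev/null 2>&1; then
    result="valid UTF-8"
else
    result="not valid UTF-8"
fi
echo "$result"
rm -f "$tmp"
```

If such a file were served with charset=UTF-8 in the Content-Type
header, a browser would typically follow the header rather than the
declaration in the file and mangle the non-ASCII characters.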


