Bug#397886: apache2.2-common: non wanted behaviour during upgrade: charset MUST not be created without user consent



severity 397886 important
(breaks something both valid and common, and that used to work)

Steinar H. Gunderson wrote:

> Now, what you are probably thinking of is the following abomination:
> 
>   <head>
>     <meta http-equiv="Content-type" content="text/html; charset=iso-8859-15">
>   </head>

This "abomination" :) is a perfectly valid one [1].

[1] http://www.w3.org/TR/html4/charset.html

> It is true that a Content-type: header with a character set will
> override this.

As per [1].
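
To make that precedence concrete, here is a minimal Python sketch of the
order described in [1]: a charset in the HTTP Content-Type header wins,
and the META declaration is only consulted when the header carries none.
The function names are hypothetical and the regular expressions are a
deliberate simplification; real browsers do considerably more.

```python
import re
from typing import Optional

# Simplified patterns, for illustration only (not a real HTML parser).
_HEADER_RE = re.compile(r'charset\s*=\s*["\']?([A-Za-z0-9._:-]+)', re.IGNORECASE)
_META_RE = re.compile(rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9._:-]+)',
                      re.IGNORECASE)


def charset_from_header(content_type: Optional[str]) -> Optional[str]:
    """Return the charset parameter of a Content-Type header, if any."""
    if not content_type:
        return None
    match = _HEADER_RE.search(content_type)
    return match.group(1).lower() if match else None


def charset_from_meta(body: bytes) -> Optional[str]:
    """Return the charset declared in a META tag near the top of the page."""
    match = _META_RE.search(body[:1024])
    return match.group(1).decode("ascii").lower() if match else None


def effective_charset(content_type: Optional[str], body: bytes) -> Optional[str]:
    """The HTTP header takes priority; META is only a fallback."""
    return charset_from_header(content_type) or charset_from_meta(body)


page = (b'<head><meta http-equiv="Content-type" '
        b'content="text/html; charset=iso-8859-15"></head>')

print(effective_charset("text/html", page))                 # META is used
print(effective_charset("text/html; charset=UTF-8", page))  # header overrides
```

With no charset in the header, the page's own declaration is honoured;
once the server adds one, the declaration is ignored.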

> However, using http-equiv is strongly discouraged in
> general, and has been so for years

Hmmm. Where in [1] (or another reference) exactly?

> -- after all, what character set would the browser assume for the
> <meta> tag?

This is described in [1] ("The META declaration must only be used when
the character encoding is organized such that ASCII-valued bytes stand
for ASCII characters").
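
A small Python sketch of why that condition is enough: in any encoding
where ASCII-valued bytes stand for ASCII characters, the META tag itself
is byte-for-byte identical, so the browser can read it before knowing
the charset. The helper name is hypothetical and the encoding list is
just a sample.

```python
META = '<meta http-equiv="Content-type" content="text/html; charset=iso-8859-15">'

# The tag contains only ASCII characters, so this is its "reference" form.
ascii_bytes = META.encode("ascii")


def is_ascii_transparent(encoding: str) -> bool:
    """True if the tag encodes to exactly the same bytes as in plain ASCII."""
    return META.encode(encoding) == ascii_bytes


for name in ("iso-8859-1", "iso-8859-15", "utf-8", "utf-16-le"):
    print(name, is_ascii_transparent(name))
```

The Latin encodings and UTF-8 pass the check; UTF-16 does not, and that
is the kind of encoding for which the META declaration must not be used.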

> (And if you were serving non-HTML content, like plain text, how would
> you specify the character set information if not in the HTTP
> headers?)

Now this is a good point: as Debian Etch uses UTF-8 locales/charset by
default, it is indeed desirable to serve text/plain files, which are
likely to contain UTF-8 text, (explicitly) as UTF-8.

But setting such an AddDefaultCharset *breaks* *working* (and perfectly
valid) pages for very little benefit. Sites that use a Latin encoding
for Latin characters are *not* broken.
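
As an illustration of the breakage, here is a short Python sketch of
what happens to an ISO-8859-15 page once the server announces UTF-8 and
the browser believes the header. The sample text is made up, and
errors="replace" only approximates how a browser renders invalid
sequences.

```python
text = "café à 10 €"

# The page as stored on disk by an author working in ISO-8859-15.
body = text.encode("iso-8859-15")

# Decoded with the charset the page declares for itself: intact.
as_declared = body.decode("iso-8859-15")

# Decoded as UTF-8 because the HTTP header now says so: the accented
# letters and the euro sign are invalid UTF-8 and get replaced.
as_forced = body.decode("utf-8", errors="replace")

print(as_declared)
print(as_forced)
```

Every non-ASCII character turns into a replacement character, although
neither the page nor its META declaration changed.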

Here is what I read in apache2.conf:

[...]
<IfModule mod_mime.c>
    #
    # Specify a default charset for all pages sent out. This is
    # always a good idea and opens the door for future internationalisation
    # of your web site, should you ever want it. Specifying it as
    # a default does little harm; as the standard dictates that a page
    # is in iso-8859-1 (latin1) unless specified otherwise i.e. you
    # are merely stating the obvious. There are also some security
    # reasons in browsers, related to javascript and URL parsing
    # which encourage you to always set a default char set.
    #
    #AddDefaultCharset ISO-8859-1
[...]

Note the ambivalent comment, and that AddDefaultCharset is, in the end,
not set (the line is commented out).

BTW, why isn't /etc/apache2/conf.d/charset properly marked as a
conffile, or integrated into apache2.conf? Or why not ask the user a
debconf question? (That is not what I suggest: I would rather let the
web admin consciously set AddDefaultCharset, if she so wishes.)

Creating the file on the fly in the postinst script is a silent, if not
hidden, way to suddenly break the user's site.

Cheers,
-- 
Daniel Déchelotte
                  http://yo.dan.free.fr/


