Bug#227273: packages.debian.org: charset mismatch (always in UTF-8?)
On Tue, Jan 13, 2004 at 10:18:07PM +0100, Denis Barbier wrote:
> On Tue, Jan 13, 2004 at 06:12:19PM +0900, Tomohiro KUBOTA wrote:
> > However, in the reality, the page is written in EUC-JP.
> > Because of this inconsistency, web browsers will render the page
> > by assuming the page is UTF-8 and the result will be the Mojibake.
> I do not know how packages.debian.org is generated, but this mismatch
It is rather simple. I download the files from the DDTP. They
include only Descriptions and the MD5SUM of the corresponding
English description. I read them in, convert them to another
charset (needed only for languages that have no UTF-8 locale),
add some HTML tags and write them to the pages.
Problems can only arise if I assume the wrong encoding for the
DDTP files, or if I make mistakes while adding the HTML tags.
I am currently trying to find out which one it is.
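
To make the two failure modes concrete, here is a minimal sketch of the
read/convert/write step in Python. It is not the actual generation script,
which is not shown in this thread. The function name render_description,
the sample string, and the choice of UTF-8 as the source encoding are all
assumptions made for illustration only. The point is that the charset
declared in the HTML must be the same one the bytes are written in, and
that a strict decode makes a wrong source-encoding assumption fail loudly
instead of producing Mojibake.

```python
import html


def render_description(raw: bytes, source_encoding: str, target_charset: str) -> bytes:
    """Hypothetical helper: transcode one description and wrap it in HTML."""
    # Strict decoding: if the assumed source encoding is wrong, this raises
    # UnicodeDecodeError rather than silently emitting garbage.
    text = raw.decode(source_encoding)
    # Escape characters that are special in HTML before adding tags.
    body = html.escape(text)
    page = (
        "<html><head>"
        f'<meta http-equiv="Content-Type" content="text/html; charset={target_charset}">'
        "</head><body><p>" + body + "</p></body></html>"
    )
    # Encode with the same charset that the meta tag declares.
    return page.encode(target_charset)


# Assumption for this sketch: the input file is UTF-8.
sample = "日本語の説明".encode("utf-8")
page = render_description(sample, "utf-8", "euc-jp")
```

If the HTTP Content-Type header sent by the web server names a charset as
well, it has to agree with this one, since browsers generally give the
header priority over the meta tag.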
The files used can be obtained from
Frank Lichtenheld <email@example.com>