
Bug#227273: packages.debian.org: charset mismatch (always in UTF-8?)



On Tue, Jan 13, 2004 at 10:18:07PM +0100, Denis Barbier wrote:
> On Tue, Jan 13, 2004 at 06:12:19PM +0900, Tomohiro KUBOTA wrote:
> [...]
> > However, in the reality, the page is written in EUC-JP.
> > Because of this inconsistency, web browsers will render the page
> > by assuming the page is UTF-8 and the result will be the Mojibake.
> [...]
> 
> I do not know how packages.debian.org is generated, but this mismatch

It is rather simple. I download the files from the DDTP. They
include only the descriptions and the MD5 sum of the corresponding
English description. I read them in, convert them to another
charset (needed only for languages that have no UTF-8 locale),
add some HTML tags, and write them to the pages.
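
The steps above can be sketched roughly as follows. This is only an
illustration in Python, not the actual generator script: the function
name render_description, its parameters, and the choice of a <p> wrapper
and a <meta> charset declaration are all assumptions made for the example.

```python
import html


def render_description(raw, source_charset, target_charset):
    """Decode one description, wrap it in HTML, and encode it for the page.

    Hypothetical helper: source_charset is the encoding the DDTP file is
    assumed to use, target_charset is the encoding the page is written in.
    """
    # Step 1: read the bytes in, using the assumed encoding of the DDTP file.
    text = raw.decode(source_charset)

    # Step 2: add some HTML tags, escaping characters such as < and &.
    body = "<p>" + html.escape(text) + "</p>"

    # The declared charset must be the one actually used in step 3,
    # otherwise browsers decode the page with the wrong encoding.
    meta = (
        '<meta http-equiv="Content-Type" '
        'content="text/html; charset=%s">' % target_charset
    )

    # Step 3: convert to the target charset and return the page bytes.
    return (meta + "\n" + body + "\n").encode(target_charset)
```

Because the declaration and the final encode share the same
target_charset argument, the two cannot drift apart in this sketch. A
mismatch like the one reported would appear if they were set
independently.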

Problems can only be caused by my assuming
the wrong encoding for the DDTP files, or by
mistakes I make while adding the HTML tags.
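
One way to test the first possibility is to check whether the bytes of
a file can be decoded at all with the encoding one assumes. The sketch
below is a minimal Python illustration; the helper name decodes_as is
hypothetical. Note that it can only rule an encoding out: a successful
decode does not prove the assumption is right, since some byte
sequences are valid in more than one charset.

```python
def decodes_as(data, charset):
    """Return True if the bytes in data are valid in the given charset."""
    try:
        data.decode(charset)
    except UnicodeDecodeError:
        return False
    return True


# Japanese text encoded as EUC-JP is not valid UTF-8, so a page holding
# these bytes while declaring UTF-8 would be rendered as mojibake.
sample = "\u65e5\u672c\u8a9e".encode("euc_jp")
print(decodes_as(sample, "euc_jp"))
print(decodes_as(sample, "utf-8"))
```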

I am currently trying to find out which one it is.
The files used can be obtained from
http://ftp.de.debian.org/debian-ddtp/dists/{stable,testing,unstable}/main/i18n/Translation-ja{,.gz,.bz2}

Regards,
-- 
Frank Lichtenheld <frank@lichtenheld.de>
www: http://www.djpig.de/


