
Bug#567781: Conversion of english pages to Unicode, via HTML entities.



* Charles Plessy <plessy@debian.org> [2011-05-17 02:03:42 CEST]:
> On Mon, May 16, 2011 at 07:34:59PM +0200, Simon Paillard wrote:
> > On Sun, May 15, 2011 at 10:24:48PM +0900, Charles Plessy wrote:
> > > 
> > > would it be welcome if I started replacing iso-8859-1 characters
> > > with HTML entities using smart-change for the English language, in
> > > order to ease conversion to Unicode?  As of today, this is the
> > > number of files that would change in the following directories.
> > [..]
> > 
> > No, I would even advise the opposite: convert the remaining entities
> > to the encoding used by each language.
> 
> Entities can be removed after the conversion, and I can help for this
> as well.

 Why convert to entities in the first place and then switch back? That
would mean an additional required bump of translation-check headers and
whatnot. I don't see the benefit in this. Like Simon pointed out, it
would make e.g. proofreading needlessly complicated. There is no use for
this in the areas that are not already using entities.

 Also, "can be removed after the conversion" would be after the
conversion of _all_ languages because otherwise you would catch the
entities that are currently still needed.

> I would like the English pages to be converted to Unicode, and offered
> my help a couple of months ago.  I proposed to first go to the common
> denominator of iso-8859-1 and Unicode, which is ASCII plus entities,
> and then to switch encoding, and then to remove the entities.
> 
> I sent this to http://bugs.debian.org/567781#77 and I thought it was accepted
> by the WWW team after discussion on IRC:
> 
> http://meetbot.debian.net/debian-www/2011/debian-www.2011-02-15-21.30.html 
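
 For illustration only, the first step of the route quoted above (going
to ASCII plus entities) could be sketched in Python as below. This is
not what smart-change does; the function name is hypothetical and the
sketch assumes the input files really are iso-8859-1 encoded.

```python
def latin1_to_ascii_entities(raw: bytes) -> bytes:
    """Replace every non-ASCII iso-8859-1 character with a numeric entity.

    Hypothetical helper, not part of smart-change or the website tooling.
    """
    text = raw.decode("iso-8859-1")
    # 'xmlcharrefreplace' turns each non-ASCII character into &#NNN;
    return text.encode("ascii", errors="xmlcharrefreplace")


sample = "Fühlst du dich mutlos".encode("iso-8859-1")
converted = latin1_to_ascii_entities(sample)

# The result is pure ASCII, so it reads the same under both encodings.
assert converted.decode("iso-8859-1") == converted.decode("utf-8")
```

 Because the output contains only ASCII bytes, the declared encoding of
such a file could be switched without touching its content again.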

 In theory yes, help is appreciated and you are invited to help, but
please try to understand why we consider that translating the 8-bit
characters to entities now, bumping all translation-check headers,
switching the default for English to UTF-8, removing the entities and
*again* bumping all translation-check headers is not the most useful
approach.

 For pages that are not up to date, that means being left behind for two
more "updates", which might result in a bigger warning. It also requires
additional care after the second conversion not to replace an entity
that isn't meant to be a direct UTF-8 character (yet).
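
 A minimal Python sketch of what that care could look like when turning
entities back into characters is given below. The set of protected
code points is an assumption made for illustration (markup-significant
characters plus the non-breaking space), and the function name is
hypothetical; it is not an existing tool of the website.

```python
import re
from html.entities import name2codepoint

# Assumption: these must stay entities (", &, ', <, > and nbsp).
PROTECTED = {34, 38, 39, 60, 62, 160}

ENTITY = re.compile(r"&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")


def entities_to_utf8(text: str) -> str:
    """Replace entities by direct characters, except protected ones."""

    def repl(match):
        body = match.group(1)
        if body.startswith("#"):
            # Numeric entity, decimal (&#252;) or hexadecimal (&#xFC;)
            if body[1] in "xX":
                codepoint = int(body[2:], 16)
            else:
                codepoint = int(body[1:])
        else:
            codepoint = name2codepoint.get(body)
        # Leave unknown, out-of-range and protected entities untouched.
        if codepoint is None or codepoint > 0x10FFFF or codepoint in PROTECTED:
            return match.group(0)
        return chr(codepoint)

    return ENTITY.sub(repl, text)


print(entities_to_utf8("F&uuml;r &amp; &#252; &lt;b&gt;"))
```

 A blanket unescape would also turn &amp; and &lt; into literal
characters and break the markup, which is why the sketch filters by
code point instead.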

> What are the other plans ?  If it is to have a massive overnight transition,
> given my timezone, you can probably count me out…

 One of the plans might be to do it in a work session during DebCamp,
which is only two months away. If you would like to help, please
coordinate with the people who have already done a conversion, and try
to understand their concerns.

 Enjoy,
Rhonda
-- 
Fühlst du dich mutlos, fass endlich Mut, los      |
Fühlst du dich hilflos, geh raus und hilf, los    | Wir sind Helden
Fühlst du dich machtlos, geh raus und mach, los   | 23.55: Alles auf Anfang
Fühlst du dich haltlos, such Halt und lass los    |


