Re: Bug#567781: Conversion of english pages to Unicode, via HTML entities.

-=| victory, Thu, Jun 09, 2011 at 03:54:51AM +0900 |=-
> On Wed, 08 Jun 2011 10:15:41 -0400
> David Prévot wrote:
> > > For example:
> > > <authors "Osamu Aoki (&#38738;&#26408; &#20462;)">
> > > <maintainer "Osamu Aoki (&#38738;&#26408; &#20462;)">
> > > Can I make them into more readable UTF-8 strings:
> > > <authors "Osamu Aoki (青木 修)">
> > > <maintainer "Osamu Aoki (青木 修)">
> > > If no one objects, I will...
> > If you wish, but please don't touch any file whose code may be used
> > verbatim in other languages (not all of them are in UTF-8 yet), since it
> > will break some of them (e.g. don't touch the code that is used to
> > generate POT files, or most of the *.src or *.def files), and do update
> > the translation check of every translation of the English pages you will
> > be editing (./smart_change.pl could be handy for that). I personally don't
> > care if you make the same changes in the translated languages, but as you
> > seem to care, feel free to change them too if they are UTF-8 ready.
> -1;
> NOT all editors handle utf8 files correctly,

Is that so? Please file bugs (and use another editor in the meantime).

> so it's NOT good to change all translations.
> apparently it will get into some trouble.
> - the editor i'm using does not break most of langs but does break some..

Which one is that? Would be nice to know to avoid it :)

> - well, assuming that the editor breaks nothing,
>   please think about how to edit those.
>   I don't want to edit files which have strings I can't read/input

So you can read &#38738; but not 青? To me the first is a complete
enigma, while the second is a far-eastern hieroglyph (whose meaning is
still unknown to me, but hey, one can't know everything).
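
For illustration only, here is a minimal Python sketch of the kind of
conversion being discussed: turning decimal numeric character references
such as &#38738; into the UTF-8 characters they stand for. The function
name decode_numeric_entities is hypothetical and is not part of the
webwml tools. The sketch assumes that only non-ASCII references should be
decoded, so that named entities like &amp; and ASCII references like &#60;
stay escaped and the markup is not broken.

```python
import re

# Matches decimal numeric character references such as "&#38738;".
NUMERIC_ENTITY = re.compile(r"&#(\d+);")


def decode_numeric_entities(text: str) -> str:
    """Replace non-ASCII decimal character references with real characters.

    ASCII references (code point < 128) are left untouched, because
    decoding something like "&#60;" into "<" could change the markup.
    """
    def replace(match: re.Match) -> str:
        code_point = int(match.group(1))
        if code_point < 128:
            return match.group(0)  # keep ASCII references escaped
        return chr(code_point)

    return NUMERIC_ENTITY.sub(replace, text)


line = '<maintainer "Osamu Aoki (&#38738;&#26408; &#20462;)">'
print(decode_numeric_entities(line))
```

The standard library's html.unescape() was deliberately not used here,
since it would also decode named entities such as &amp; and &lt;, which
is probably not wanted when editing page sources in place.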

>   such as accented characters and Russian, Arabic, etc.

You shouldn't have to edit them. If you need to change a file 
containing unknown characters, just don't change the text in foreign 

