
Re: how to move to UTF-8 ? (was: An encoding problem)



On Thu, Jul 30, 2009 at 02:12:06PM +0200, Frans Pop wrote:
> Simon Paillard wrote:
> > On Wed, Jul 29, 2009 at 06:27:02PM +0200, Frans Pop wrote:
> > Could you please describe the steps you have performed and how ?
> 
> I actually used (sponge is from moreutils):
> $ for i in $(find -type f); do \
> 	iconv -f iso-8859-15 -t utf-8 $i | sponge $i; \
>   done

Didn't know about sponge, thanks.
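
For what it's worth, a more defensive variant of that loop might look like
the sketch below. It is not what was actually run: it assumes a POSIX sh
plus mktemp, uses a temporary file instead of sponge, copes with spaces in
file names, and skips the CVS metadata directories. The function name
convert_tree is made up for illustration.

```shell
# Hypothetical helper: convert every regular file below DIR (default: .)
# from ISO-8859-15 to UTF-8 in place, leaving CVS/ directories untouched.
convert_tree() {
    find "${1:-.}" -type d -name CVS -prune -o -type f -exec sh -c '
        for f do
            tmp=$(mktemp) || exit 1
            # Convert into a temporary file first so that a failed run
            # never truncates the original.
            if iconv -f ISO-8859-15 -t UTF-8 "$f" > "$tmp"; then
                cat "$tmp" > "$f"    # overwrite contents, keep permissions
            else
                echo "skipped (conversion failed): $f" >&2
            fi
            rm -f "$tmp"
        done
    ' sh {} +
}
```

Called as "convert_tree german/" it would convert one language directory at
a time, which keeps the resulting 'cvs diff -u' reviewable.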
 
> I then checked the result with 'cvs diff -u'. That showed some pages
> (incorrectly) already had utf-8 encoded chars, so I reverted those.

Right, this should be checked as well (a grep in the German part is simple,
as German uses only 4 common 8-bit characters).
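
A language-independent check could be sketched as follows, assuming iconv
and a POSIX sh: a file that contains non-ASCII bytes and still decodes
cleanly as UTF-8 was most probably converted already. This is a heuristic
only, and the function name list_probably_utf8 is made up for illustration.

```shell
# Hypothetical helper: print the files below DIR (default: .) that contain
# bytes outside 7-bit ASCII and are nevertheless valid UTF-8.
list_probably_utf8() {
    find "${1:-.}" -type d -name CVS -prune -o -type f -exec sh -c '
        for f do
            # Pure ASCII files look the same in both encodings: skip them.
            [ $(LC_ALL=C tr -d "\000-\177" < "$f" | wc -c) -gt 0 ] || continue
            # iconv exits non-zero on an invalid UTF-8 sequence.
            if iconv -f UTF-8 -t UTF-8 "$f" > /dev/null 2>&1; then
                printf "%s\n" "$f"
            fi
        done
    ' sh {} +
}
```

Files listed by this should be left out of the ISO-8859-15 conversion, or
reverted afterwards if they were already converted a second time.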
 
> It turned out that this mangled the generated $Date fields (2007-01-01
> had become 2007/01/01); I corrected that by doing (possibly not strictly
> necessary as the server would update them anyway on commit, but I wanted
> my diffs clean):

We have already had trouble with this in the past, as the changed date field
resulted in conflicts in the working copy on www-master. That is probably a
CVS bug ...

> I also did a cleanup, replacing entities by encoded characters, e.g:
> $ for i in $(find -type f); do sed -ri "s/&auml;/ä/g" $i; done

Indeed, this is useful and simplifies searching the source with grep.
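
For the German pages the same idea could cover all the usual entities in
one pass. The sketch below is a plain stdin-to-stdout filter; the function
name decode_entities and the choice of entities are assumptions, and it
should only be applied to files that are already UTF-8.

```shell
# Hypothetical helper: replace the common German character entities by the
# corresponding UTF-8 characters. Other entities such as &amp; are kept.
decode_entities() {
    sed -e 's/&auml;/ä/g' -e 's/&ouml;/ö/g' -e 's/&uuml;/ü/g' \
        -e 's/&Auml;/Ä/g' -e 's/&Ouml;/Ö/g' -e 's/&Uuml;/Ü/g' \
        -e 's/&szlig;/ß/g'
}
```

Used as "decode_entities < in.wml > out.wml", with the file names here
being placeholders.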
 
Jens

