Re: How to convert Unicode numbers into proper utf8 text?

To: debian-user@lists.debian.org
Subject: Re: How to convert Unicode numbers into proper utf8 text?
From: T <mlist4suntong@yahoo.com>
Date: Wed, 18 Oct 2006 10:33:26 -0400
Message-id: <pan.2006.10.18.14.33.23.285532@yahoo.com>
References: <4535EFAF.2030801@gmail.com>

On Wed, 18 Oct 2006 17:11:11 +0800, Jeff Zhang wrote:

> OOo 2.0.4 can export LaTeX files now, but East Asian text is converted
> into Unicode numbers, like:
> ...
> \begin{document}
> [95EE?][7956?][5B97?][4E4B?][5FB7?][6CFD?][FF0C?][543E?][8EAB?][6240?][4EAB
> ?][8005?][FF0C?][662F?][5F53?][5FF5?][5176?][79EF?][7D2F?][4E4B?][96BE?][FF
> 1B?][95EE?][5B50?][5B59?][4E4B?][798F?][7949?][FF0C?][543E?][8EAB?][6240?][
> 8D3B?][8005?][FF0C?]
> [662F?][8981?][601D?][5176?][503E?][8986?][4E4B?][6613?][3002?]
> [3000?][3000?][FF0D?][FF0D?][300A?][83DC?][6839?][8C2D?][300B?]
> \end{document}
> 
> How can I convert those Unicode numbers (95EE, 7956, ...) into UTF-8 text?

perl -p000e 's/\n//g; s / \[([0-9a-f]{1,4}) \?\] / chr(hex($1)) /giex;'
             ^^^^^^^^

NB: without the "s/\n//g", which removes every \n, not all characters are
converted. Markers that are split across a line break, e.g. [4EAB, [FF, ...,
never match the pattern.

If you can get OOo to write each paragraph on a single line, then you don't
need it.

cat $tf.tex | perl -p000e '...'

\begin{document}问祖宗之德泽，吾身所享者，是当念其积累之难；问子孙之福祉，吾身所贻者，是要思其倾覆之易。　　－－《菜根谭》\end{document}


-- 
Tong (remove underscore(s) to reply)
  http://xpt.sourceforge.net/
