
RE: gb <==> big5 conversion module



hi all (sorry for writing in GB2312)
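     Anthony Fok said that some of the characters I listed last time fall
within the Big5+ range. :) Some GB2312 characters can be written several
ways in Big5, and that case can only be resolved by converting at the word
level. I am now writing a word segmentation program and it is going fairly
well: I found some relevant papers and have already written a prototype.
What is still missing is a GB2312 <-> Big5 phrase mapping table. For the
GB2312 segmentation dictionary I am currently using the phrases that ship
with unicon-im; a Big5 dictionary should be available in xcin. None of
these dictionaries carry part-of-speech information, though :(, so I will
just have to make do. For now I do not plan to call iconv from autoconvert,
because not every platform uses glibc. :) It would still be better to unify
the character tables, heh. Looking forward to your results.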
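
To illustrate the word-level idea, here is a minimal sketch in Python. It is not the autoconvert prototype: it assumes forward maximum matching as the segmentation method, works on already-decoded text rather than raw GB2312/Big5 bytes, and uses two tiny made-up tables (PHRASES and CHARS) in place of the phrase mapping table that is still missing.

```python
# Hypothetical phrase table (GB-side word -> Big5-side word). A real table
# would be far larger; these few entries exist only for illustration.
PHRASES = {
    "头发": "頭髮",
    "发展": "發展",
    "干净": "乾淨",
    "干部": "幹部",
}

# Hypothetical single-character fallback table. Ambiguous characters such
# as "发" and "干" can only get one default here, which is the problem
# that word-level conversion is meant to solve.
CHARS = {
    "头": "頭",
    "发": "發",
    "干": "干",
    "净": "淨",
}

MAX_LEN = max(len(p) for p in PHRASES)


def segment(text):
    """Forward maximum matching: at each position take the longest
    dictionary phrase, falling back to a single character."""
    words = []
    i = 0
    while i < len(text):
        for n in range(min(MAX_LEN, len(text) - i), 0, -1):
            piece = text[i:i + n]
            if n == 1 or piece in PHRASES:
                words.append(piece)
                i += n
                break
    return words


def convert(text):
    """Convert word by word; words missing from PHRASES are converted
    character by character, leaving unknown characters unchanged."""
    out = []
    for word in segment(text):
        if word in PHRASES:
            out.append(PHRASES[word])
        else:
            out.append("".join(CHARS.get(c, c) for c in word))
    return "".join(out)


print(segment("干净的头发"))
print(convert("干净的头发"))
```

With these tables, "发" comes out as "髮" inside "头发" but as "發" inside "发展", which a single-character table cannot express. Greedy matching can still pick the wrong split on ambiguous text, so this is only a starting point.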

					Yu Guanghui

> -----Original Message-----
> From: Anthony Fok [mailto:foka@master.debian.org] On Behalf Of Yong Li
> Sent: Sunday, January 14, 2001 4:54 PM
> To: debian-chinese-big5@lists.debian.org
> Subject: gb <==> big5 conversion module
>
>
> Hello T.H.Hsieh and Yu Guanghui,
> I just came back yesterday from a long vacation and found that I missed
> both your posts regarding gb <==> big5 conversion.
>
> Before I left for vacation, I was also working on writing a gb <==> big5
> gconv module. The first part of my plan was to establish a "best" mapping
> between gb and big5. I did not take any existing conversion table because
> none of them documented how they got their conversions and I don't feel
> comfortable with that. So I rolled my own and took this opportunity to check a
> few popular gb <==> big5 converters. Most of this work has been finished.
> All the gb -> big5 conversions have been checked, but there are still some
> big5 -> gb conversions left. The results so far look good. Compared with the table of
> 130+ unmapped gb codes posted by Yu Guanghui a while ago, 35 of them are
> mapped in my table. There are 4 codes not mapped in my table, but mapped in
> autoconvert. However, I suspect that autoconvert made a mistake in all 4 cases.
> I'll write a more detailed post describing my methodology, conversion
> table and the comparison results in the next few days. Then I'd like to hear
> from you. If we all agree upon it, it's fairly easy to write the module.
> Hopefully it will be in time for the 2.2.1 release, which is said to be soon.
>
> Regards,
> Yong Li
> (rigel)
>
> --
> | This message was re-posted from debian-chinese-gb@lists.debian.org
> | and converted from gb2312 to big5 by an automatic gateway.
>
>



