Re: PinYin Standard
-----BEGIN PGP SIGNED MESSAGE-----
> Here is the wc on pinyin.cin. I think I can first build on this: it has
> 20976 lines, including multi-character input, so the single-character
> to pinyin entries don't really cover all of GBK. Also, the standard
> pronunciation of a word should include the tone: 1-6 for Cantonese,
> 1-4 for Mandarin...
> In this file you get more detail on how it should look...
> What is in pinyin.cin may not be enough for my use... of course better than
> nothing :)
> Thanks a lot for your input.
AFAIK there is no database that contains all Chinese characters; there are
too many, and most of them are not used at all. Big5 and GB2312 contain the
most common ones, which is sufficient for daily life. Cantonese should only be
available for the HKSCS extensions in the Unihan database, although Cantonese
speakers also use the Big5 and nowadays GB characters, just with different
pronunciations.
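
To make the coverage question concrete, here is a minimal Python sketch that reads readings in the tab-separated Unihan layout (code point, field name, value) and checks whether each character is encodable in a legacy charset. The function names parse_readings and in_charset are my own, and the sample lines are typed in by hand for illustration rather than copied from the database, so treat the readings as placeholders.

```python
from collections import defaultdict

# Hand-written sample in the Unihan layout: code point, field, value.
# The readings are illustrative placeholders, not an extract of the database.
SAMPLE = """\
# comment lines start with '#'
U+4E2D\tkMandarin\tzhōng
U+4E2D\tkCantonese\tzung1 zung3
U+597D\tkMandarin\thǎo
U+597D\tkCantonese\thou2 hou3
U+3400\tkCantonese\tjau1
"""


def parse_readings(text, fields=("kMandarin", "kCantonese")):
    """Return {character: {field: [readings]}} for the requested fields."""
    table = defaultdict(dict)
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        codepoint, field, value = line.split("\t", 2)
        if field not in fields:
            continue
        char = chr(int(codepoint[2:], 16))  # "U+4E2D" -> "中"
        table[char][field] = value.split()
    return dict(table)


def in_charset(char, codec):
    """True if the character can be encoded in the given legacy charset."""
    try:
        char.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


readings = parse_readings(SAMPLE)
for char, by_field in sorted(readings.items()):
    where = "in gb2312" if in_charset(char, "gb2312") else "not in gb2312"
    print(f"U+{ord(char):04X}", by_field, where)
```

The same loop run over a real readings file would show which characters in your target charset have no Mandarin or Cantonese entry at all, which is the gap you would have to fill from a dictionary.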
You should also keep in mind that the tone (1-5 for Mandarin) of a character
sometimes varies depending on the characters before and after it.
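
The best-known case of this is the Mandarin third-tone rule: a third tone followed by another third tone is pronounced as a second tone. A rough Python sketch, assuming syllables are written as numbered pinyin such as "ni3" (the function name third_tone_sandhi is made up for this example). It only looks at the immediate neighbour and ignores word grouping in longer runs of third tones, so it is an illustration of why a per-character table is not enough, not a complete rule set.

```python
import re

# Numbered pinyin syllable: letters followed by a tone digit 1-5.
SYLLABLE = re.compile(r"^([a-zü:]+)([1-5])$")


def third_tone_sandhi(syllables):
    """Turn tone 3 into tone 2 when the following syllable is lexically tone 3."""
    parsed = []
    for s in syllables:
        m = SYLLABLE.match(s)
        if m is None:
            raise ValueError(f"not a numbered pinyin syllable: {s!r}")
        parsed.append((m.group(1), int(m.group(2))))

    out = []
    for i, (base, tone) in enumerate(parsed):
        # Compare against the dictionary tone of the next syllable.
        next_tone = parsed[i + 1][1] if i + 1 < len(parsed) else None
        if tone == 3 and next_tone == 3:
            tone = 2
        out.append(f"{base}{tone}")
    return out


print(third_tone_sandhi(["ni3", "hao3"]))     # ['ni2', 'hao3']
print(third_tone_sandhi(["zhong1", "guo2"]))  # ['zhong1', 'guo2']
```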
And if you want to include minority dialects and languages too, what about
Taiwanese, Hakka, etc.?
If you want to do that, you really need a real dictionary.
Arne Goetje <email@example.com>
(Spam catcher. Address might change in future!)
PGP/GnuPG key: 1024D/685D1E8C
Fingerprint: 2056 F6B7 DEA8 B478 311F 1C34 6E9F D06E 685D 1E8C
Key available at wwwkeys.pgp.net. Encrypted e-mail preferred.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)
-----END PGP SIGNATURE-----
| This message was re-posted from firstname.lastname@example.org
| and converted from big5 to gb2312 by an automatic gateway.