Re: CJK workers, throw off your chains! (fwd)
On Fri, Feb 23, 2001 at 02:02:26AM -0700, rigel wrote:
> On Thu, Feb 22, 2001 at 11:14:41AM -0500, Thomas Chan wrote:
> > I don't know who else is interested in CCCII mappings, but there isn't
> > much time--3.1 will be released at the end of March. CCCII mappings are
> > also problematic, because source separation has been abandoned. (You can
> Do you know why it was abandoned? Is it believed that CCCII has been covered
> by the combination of other charsets?
> I personally am very interested to see how the 70195 han characters in
> unicode 3.1 compare with the 75684 in CCCII. Given that CCCII contains
> a lot of variants, there's a good possibility that unicode already has more
> hanzi than CCCII. It'll be interesting to see which CCCII codes are not
> covered yet.
> Although not exactly a fan of CCCII, I admire its well-thought-out design.
> It will be useful to have a mapping between CCCII and unicode. A CCCII
> to CNS mapping will help some in this regard. Does anyone know such mapping
Unihan-3.txt is kind of a mapping. Unfortunately, it is far from complete.
What I really care about is a good approach to constructing "indices" for Chinese
documents (especially books). CCDB (Chinese Characters DataBase) of CCCII
provides some promising ways to order Chinese characters. That's why I would
like to see a complete Unihan <-> CCCII mapping.
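
As a rough illustration of what the existing partial mapping gives us, here
is a minimal sketch that pulls CCCII codes out of a Unihan-style file. It
assumes tab-separated lines of the form "U+XXXX<TAB>field<TAB>value" with the
CCCII codes under the kCCCII field. The function name parse_kcccii and the
sample values below are hypothetical placeholders, not data taken from
Unihan-3.txt.

```python
def parse_kcccii(lines):
    """Build Unicode -> CCCII and CCCII -> Unicode tables from Unihan-style lines.

    Assumes each data line is "U+XXXX<TAB>field<TAB>value"; only lines whose
    field is kCCCII are used. A value may hold several space-separated codes.
    """
    uni_to_cccii = {}
    cccii_to_uni = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split("\t")
        if len(parts) != 3 or parts[1] != "kCCCII":
            continue  # not a CCCII mapping line
        codepoint = int(parts[0][2:], 16)  # drop the "U+" prefix
        codes = parts[2].split()
        uni_to_cccii[codepoint] = codes
        for code in codes:
            cccii_to_uni.setdefault(code, []).append(codepoint)
    return uni_to_cccii, cccii_to_uni


# Placeholder sample lines (illustrative values only).
SAMPLE = (
    "# sample\n"
    "U+4E00\tkCCCII\t213021\n"
    "U+4E00\tkBigFive\tA440\n"
    "U+4E01\tkCCCII\t213022\n"
)

uni_to_cccii, cccii_to_uni = parse_kcccii(SAMPLE.splitlines())
print(len(uni_to_cccii), "code points have a CCCII code in the sample")
```

Counting the keys of the reverse table against the full CCCII repertoire
would show how far the mapping is from complete, and which CCCII codes still
have no Unicode counterpart listed.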
>  There is some ambiguity about how many characters are encoded in CCCII.
> According to Ken Lunde's "CJKV Information Processing", the formal release
> version has 53940 hanzi, while the draft version contains 75684 which is
> the number I quoted. The book was published in 1999.
Could you tell me where to find the draft version containing 75684
characters? And who maintains the CCCII standard now? I would really
like to know the answers.
Chia-Sheng Chang (Jonathan Chang)
Institute of Communications Engineering
College of Electrical Engineering and Computer Science
National Taiwan University
Taipei, Taiwan 10617, R.O.C.