>>>>> "Roger" == Roger So <rogerso@sis.dhs.org> writes:

    Roger> So, given a stream of bytes which might contain multibyte
    Roger> characters, how would I test whether a byte is, say, printable?
    Roger> Do I need to test for MB_CUR_MIN to MB_CUR_MAX number of bytes
    Roger> instead of individual bytes?  (seems wildly inefficient ...)

Or is it that when MB_CUR_MAX!=1, you really shouldn't "check whether a
byte is printable"?  It doesn't make much sense to me: even if 0xA6 or
0xA7 appear as a byte in a BIG5 character, there is no guarantee that
D8A7, say, is a valid BIG5!  So it's neither "printable" nor
"unprintable"...

    Roger> Also, in glibc, are widechars always in Unicode?  (UCS-4?)

I think so.

Regards,
Isaac.

-- 
| This message was re-posted from debian-chinese-big5@lists.debian.org
| and converted from big5 to gb2312 by an automatic gateway.
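
One way to act on the point that printability belongs to characters rather than bytes is to decode the stream one multibyte character at a time and classify the resulting wide character. The following is only a minimal sketch in C, assuming the bytes are in the current locale's encoding. The helper name all_printable is hypothetical; mbrtowc and iswprint are standard C library functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper (not from the original message): returns 1 if
 * buf[0..len) decodes cleanly in the current locale and every decoded
 * character is printable, and 0 otherwise.
 */
int all_printable(const char *buf, size_t len)
{
    mbstate_t st;

    assert(buf != NULL);
    memset(&st, 0, sizeof st);          /* start in the initial shift state */

    while (len > 0) {
        wchar_t wc;
        /* Decode exactly one character; n is the number of bytes it used. */
        size_t n = mbrtowc(&wc, buf, len, &st);

        if (n == (size_t)-1 || n == (size_t)-2)
            return 0;                   /* invalid or truncated sequence */
        if (n == 0)
            return 0;                   /* embedded NUL is not printable */
        if (!iswprint((wint_t)wc))
            return 0;                   /* valid character, but not printable */

        buf += n;
        len -= n;
    }
    return 1;
}
```

A caller would normally run setlocale(LC_CTYPE, "") from <locale.h> once at startup so that mbrtowc decodes according to the user's locale (a BIG5 one, for example); without that call the program stays in the "C" locale. Because mbrtowc reports how many bytes each character consumed, there is no need to try every length up to MB_CUR_MAX by hand.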