>>>>> "Roger" == Roger So <rogerso@sis.dhs.org> writes:

    Roger> So, given a stream of bytes which might contain multibyte
    Roger> characters, how would I test whether a byte is, say, printable?
    Roger> Do I need to test for MB_CUR_MIN to MB_CUR_MAX number of bytes
    Roger> instead of individual bytes? (seems wildly inefficient ...)

Or is it that when MB_CUR_MAX!=1, you really shouldn't "check whether a
byte is printable"? It doesn't make much sense to me: even if 0xA6 or
0xA7 appear as a byte in a BIG5 character, there is no guarantee that
D8A7, say, is a valid BIG5! So it's neither "printable" nor
"unprintable"...

    Roger> Also, in glibc, are widechars always in Unicode? (UCS-4?)

I think so.

Regards,
Isaac.
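
To illustrate the point that printability is a property of whole characters rather than of bytes, here is a minimal sketch in C that decodes a buffer one character at a time with mbrtowc() and tests each decoded character with iswprint(). The function name count_printable and its "bad" out-parameter are made up for this example and are not from the thread. The sketch also assumes a terminating null character occupies exactly one byte, which may not hold for stateful encodings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper (not from the thread): returns the number of
   printable characters in buf[0..len), decoding with the multibyte
   encoding of the current locale.  Invalid bytes and a truncated
   trailing sequence are counted in *bad when bad is non-NULL. */
size_t count_printable(const char *buf, size_t len, size_t *bad)
{
    mbstate_t st;
    size_t printable = 0, invalid = 0, i = 0;

    assert(buf != NULL);
    memset(&st, 0, sizeof st);          /* initial conversion state */

    while (i < len) {
        wchar_t wc;
        size_t n = mbrtowc(&wc, buf + i, len - i, &st);

        if (n == (size_t)-1) {          /* invalid sequence: skip one byte */
            invalid++;
            i++;
            memset(&st, 0, sizeof st);  /* state is undefined after an error */
        } else if (n == (size_t)-2) {   /* incomplete character at end of buffer */
            invalid++;
            break;
        } else {
            if (n == 0)                 /* decoded L'\0'; assumed to be one byte */
                n = 1;
            if (iswprint((wint_t)wc))   /* test the character, not the byte */
                printable++;
            i += n;
        }
    }
    if (bad != NULL)
        *bad = invalid;
    return printable;
}
```

The caller is expected to have selected the locale first, for example with setlocale(LC_CTYPE, ""), since both mbrtowc() and iswprint() depend on it. Each byte is examined once, so the cost stays linear in the buffer length, and no byte is tested against every length from 1 to MB_CUR_MAX.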