
Re: [Draft] Writing i18n apps with glibc 2.2



: > Or is it that when MB_CUR_MAX!=1, you really shouldn't "check whether a byte  
: > is printable"?  It doesn't make much sense to me: even if 0xA6 or 0xA7
: > appear as a byte in a BIG5 character, there is no guarantee that D8A7, say,
: > is a valid BIG5 character!  So it's neither "printable" nor "unprintable"...
: 
: No, the point I was making was, say you have 0x20 0xD0 0xF6 (a space
: followed by some Big5 character).  First you check isprint(0x20); that
: succeeds, so you move on to isprint(0xD0).  Oops, 0xD0 is not printable.
: But MB_CUR_MAX = 2.  So we check it now with 2 bytes together ...
: isprint(0xD0F6) ... great, that's printable, and we move on to the next
: byte ...

My opinion is that the program should not do this by hand. Whether
a multibyte string or a multibyte character is printable should be
left to glibc to determine. For your example, I would write something
like this:

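	/* needs <string.h> for memset() and <wchar.h> for mbsrtowcs() */
	char buf[100];
	const char *s;		/* mbsrtowcs() takes a const char ** */
	wchar_t wbuf[100];
	mbstate_t ps;

	buf[0] = (char)0x20;
	buf[1] = (char)0xD0;
	buf[2] = (char)0xF6;
	buf[3] = '\0';

	/* set the locale, etc .... */

	/* start from the initial conversion state */
	memset(&ps, '\0', sizeof(mbstate_t));
	s = buf;
	if (mbsrtowcs(wbuf, &s, 100, &ps) == (size_t)-1) {
	    /* conversion failed (errno is EILSEQ); pointer s now points
	       to the byte sequence which mbsrtowcs() could not convert. */
	}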


T.H.Hsieh


