
Bug#316147: iconv: options for illegal characters



Package: libc6
Version: 2.3.2.ds1-22
Severity: wishlist
File: /usr/bin/iconv
Tags: upstream

-c is nice, but it would be good to know just how many invalid
characters were omitted from the output.
--verbose won't say, but should.

$ iconv -f gb2312 -t big5 gdxw08.htm | wc -c
iconv: illegal input sequence at position 906
906
$ iconv -f gb2312 -t big5 -c gdxw08.htm | wc -c - gdxw08.htm
   4585 -
   4585 gdxw08.htm
   9170 total
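
As a rough illustration of the count asked for above, here is a minimal
Python sketch. It is not anything iconv itself provides. It assumes
Python's built-in gb2312 and big5 codecs are close enough to iconv's for
the purpose, and convert_counting and the handler name "drop_and_count"
are hypothetical names made up for this example. It drops whatever cannot
be converted, as -c is documented to do, and also reports how many
sequences were dropped.

```python
import codecs


def convert_counting(data, src="gb2312", dst="big5"):
    """Convert bytes from src to dst, dropping what cannot be converted.

    Returns (converted_bytes, dropped), where dropped is the number of
    sequences rejected while decoding or encoding.
    """
    dropped = 0

    def drop_and_count(err):
        # Called once per rejected sequence, for both decode and encode errors.
        nonlocal dropped
        dropped += 1
        # Emit nothing for the bad sequence and resume right after it.
        return ("", err.end)

    # Hypothetical handler name; re-registering simply replaces the old one.
    codecs.register_error("drop_and_count", drop_and_count)

    text = data.decode(src, errors="drop_and_count")
    converted = text.encode(dst, errors="drop_and_count")
    return converted, dropped
```

For the file in this report, something like
convert_counting(open("gdxw08.htm", "rb").read()) would hand back the
converted bytes together with the number of dropped sequences, which is
the figure --verbose could print.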

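The man page says "Omit invalid characters from output". Since the -c
output above is the same size as the input (4585 bytes), it looks as
though nothing was actually omitted, so maybe the man page should say
more, like "just send the character it can't deal with through to the
output unconverted".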

Or better yet, give the user the choice of deleting them, sending them
through, or redirecting them, etc.

Best of all would be an option to "mark unconvertible characters
with @--> <--@ [or customizable]"
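
To make the three choices concrete (delete, send through, mark), here is
a minimal Python sketch of what such an option could do. None of this is
an existing iconv flag: convert_with_policy, the policy names "delete",
"pass" and "mark", and the marks parameter are all hypothetical. It
assumes the input is valid in the source charset, that the target charset
is stateless so characters can be encoded one at a time, and it returns
the rejected characters separately as a stand-in for redirecting them.

```python
def convert_with_policy(data, policy="delete", src="gb2312", dst="big5",
                        marks=(b"@-->", b"<--@")):
    """Convert bytes from src to dst, handling unconvertible characters
    according to policy. Returns (converted_bytes, rejected_characters).
    """
    if policy not in ("delete", "pass", "mark"):
        raise ValueError("unknown policy: %r" % (policy,))

    text = data.decode(src)  # assumption: the input is valid in src
    out = bytearray()
    rejects = []             # what a "redirect" option could write elsewhere

    for ch in text:
        try:
            out += ch.encode(dst)
            continue
        except UnicodeEncodeError:
            rejects.append(ch)

        raw = ch.encode(src)  # the character's original, unconverted bytes
        if policy == "pass":
            out += raw                        # send through unconverted
        elif policy == "mark":
            out += marks[0] + raw + marks[1]  # wrap in customizable markers
        # policy == "delete": emit nothing

    return bytes(out), rejects
```

With policy="mark" and the default marks, each unconvertible character
comes out as its original bytes wrapped in @--> and <--@, so it is easy
to grep for afterwards.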


