Your message dated Sun, 15 Apr 2007 02:48:31 +0200
with message-id <20070415004831.GA22150@artemis>
and subject line Bug#316147: iconv: options for illegal characters
has caused the attached Bug report to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what I am
talking about this indicates a serious mail system misconfiguration
somewhere. Please contact me immediately.)

Debian bug tracking system administrator
(administrator, Debian Bugs database)
--- Begin Message ---
- To: Debian Bug Tracking System <submit@bugs.debian.org>
- Subject: iconv: options for illegal characters
- From: Dan Jacobson <jidanni@jidanni.org>
- Date: Wed, 29 Jun 2005 01:53:33 +0800
- Message-id: <E1DnKH7-00015b-HV@jidanni1>
Package: libc6
Version: 2.3.2.ds1-22
Severity: wishlist
File: /usr/bin/iconv
Tags: upstream

-c is nice, but it would be nice to know just how many invalid
characters were omitted from the output.
--verbose won't say, but should.

$ iconv -f gb2312 -t big5 gdxw08.htm | wc -c
iconv: illegal input sequence at position 906
906
$ iconv -f gb2312 -t big5 -c gdxw08.htm | wc -c - gdxw08.htm
4585 -
4585 gdxw08.htm
9170 total

The man page said "Omit invalid characters from output", well maybe it
should say more, like "just send the character it can't deal with
through to the output unconverted".

Or better yet, give the user the choice of deleting them, sending them
through, or redirecting them, etc.

Greater still would be an option to "mark unconvertible characters
with @--> <--@ [or customizable]"
--- End Message ---
--- Begin Message ---
- To: Dan Jacobson <jidanni@jidanni.org>, 316147-done@bugs.debian.org
- Subject: Re: Bug#316147: iconv: options for illegal characters
- From: Pierre HABOUZIT <madcoder@debian.org>
- Date: Sun, 15 Apr 2007 02:48:31 +0200
- Message-id: <20070415004831.GA22150@artemis>
- In-reply-to: <E1DnKH7-00015b-HV@jidanni1>
- References: <E1DnKH7-00015b-HV@jidanni1>
On Wed, Jun 29, 2005 at 01:53:33AM +0800, Dan Jacobson wrote:
> Package: libc6
> Version: 2.3.2.ds1-22
> Severity: wishlist
> File: /usr/bin/iconv
> Tags: upstream
>
> -c is nice, but it would be nice to know just how many invalid
> characters were omitted from the output.
> --verbose won't say, but should.
>
> $ iconv -f gb2312 -t big5 gdxw08.htm | wc -c
> iconv: illegal input sequence at position 906
> 906
> $ iconv -f gb2312 -t big5 -c gdxw08.htm | wc -c - gdxw08.htm
> 4585 -
> 4585 gdxw08.htm
> 9170 total

iconv is meant to be strict. If you want it to omit errors, then use
//IGNORE after your encoding name, or //TRANSLIT to try some proximity
transliterations.

If you want more subtle ways, recode is the tool you want.

-- 
·O· Pierre Habouzit
··O madcoder@debian.org
OOO http://www.madism.org
Attachment: pgp4LTsSCJNqB.pgp
Description: PGP signature
--- End Message ---
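
The behaviours requested in the report (dropping invalid input, counting what
was dropped, and marking it with "@--> <--@") can be sketched outside iconv.
The following is a minimal illustration using Python's codecs error handlers,
not a description of iconv or recode. The handler name "mark_bad", the helper
make_marker, and the sample bytes are assumptions made up for this sketch, and
the marker shows the offending bytes as hex rather than passing them through.

```python
import codecs


def make_marker(log):
    """Build a decode error handler that records and marks invalid bytes."""
    def handler(err):
        # Only decoding errors are handled here; anything else is re-raised.
        if not isinstance(err, UnicodeDecodeError):
            raise err
        bad_bytes = err.object[err.start:err.end]
        log.append((err.start, bad_bytes))  # remember position and raw bytes
        # Replacement text, plus the input offset at which decoding resumes.
        return "@-->" + bad_bytes.hex() + "<--@", err.end
    return handler


bad = []  # filled in by the handler while decoding
codecs.register_error("mark_bad", make_marker(bad))  # hypothetical handler name

# Two valid GB2312 characters (U+4E2D, U+6587), one invalid byte, then ASCII.
data = "\u4e2d\u6587".encode("gb2312") + b"\xff" + b"ok"

dropped = data.decode("gb2312", errors="ignore")   # silently omit invalid input
marked = data.decode("gb2312", errors="mark_bad")  # mark and count instead

print("invalid sequences:", len(bad))
print(marked)

# Re-encode for the target charset, as in the gb2312 -> big5 example.
big5_bytes = dropped.encode("big5")
```

The count comes from the handler's log rather than from comparing output
sizes, since a byte count such as the one from wc -c does not say how many
characters were omitted.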