
Bug#618923: ITP: libguess -- Character-set detection library



On 2011-03-20 00:24, Bilal Akhtar wrote:
Package: wnpp
Severity: wishlist
Owner: Bilal Akhtar<bilalakhtar@ubuntu.com>


* Package name    : libguess
   Version         : 0.1
   Upstream Author : William Pitcock<nenolod@atheme.org>
* URL             : http://www.atheme.org/project/libguess
* License         : BSD
   Programming Lang: C
   Description     : Character-set detection library

LibGuess is a high-speed character-set detection library. LibGuess
employs deterministic finite automata (DFAs) to deduce the character
set of the input buffer. The advantage of this is that all character
sets can be checked in parallel, and quickly. Internally, LibGuess
passes each byte to every DFA in the same pass, meaning that the
winning character set can be deduced as efficiently as possible.
LibGuess is fully reentrant, using only local stack memory for DFA
operations.
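
The single-pass idea described above can be illustrated with a minimal
sketch. This is not libguess's actual code or API: the names dfa_state,
utf8_step, gbk_step and sketch_guess are hypothetical, and the byte
ranges are deliberately simplified (no overlong or surrogate checks for
UTF-8, no GB18030 four-byte forms). It only shows each input byte being
handed to every automaton in one loop, with all state kept on the stack.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes; libguess itself reports encoding names. */
enum guess { GUESS_UNKNOWN, GUESS_UTF8, GUESS_GBK, GUESS_AMBIGUOUS };

/* Hypothetical per-encoding automaton state, small enough for the stack. */
typedef struct {
    int alive;    /* 0 once this encoding has rejected the input */
    int pending;  /* trail bytes still expected for the current character */
} dfa_state;

/* Advance a simplified UTF-8 validator by one byte. */
static void utf8_step(dfa_state *s, unsigned char b)
{
    if (!s->alive)
        return;
    if (s->pending > 0) {
        if ((b & 0xC0) == 0x80)
            s->pending--;          /* valid continuation byte */
        else
            s->alive = 0;
    } else if (b < 0x80) {
        /* ASCII: stay in the start state */
    } else if (b >= 0xC2 && b <= 0xDF) {
        s->pending = 1;
    } else if (b >= 0xE0 && b <= 0xEF) {
        s->pending = 2;
    } else if (b >= 0xF0 && b <= 0xF4) {
        s->pending = 3;
    } else {
        s->alive = 0;              /* byte can never start a character */
    }
}

/* Advance a simplified GBK validator by one byte:
 * lead byte 0x81-0xFE, trail byte 0x40-0xFE except 0x7F. */
static void gbk_step(dfa_state *s, unsigned char b)
{
    if (!s->alive)
        return;
    if (s->pending > 0) {
        if (b >= 0x40 && b <= 0xFE && b != 0x7F)
            s->pending = 0;
        else
            s->alive = 0;
    } else if (b < 0x80) {
        /* ASCII: stay in the start state */
    } else if (b >= 0x81 && b <= 0xFE) {
        s->pending = 1;
    } else {
        s->alive = 0;
    }
}

/* Feed every byte to both automata in a single pass over the buffer. */
enum guess sketch_guess(const unsigned char *buf, size_t len)
{
    dfa_state utf8 = {1, 0};
    dfa_state gbk = {1, 0};
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        utf8_step(&utf8, buf[i]);
        gbk_step(&gbk, buf[i]);
    }

    /* An encoding survives only if it ended on a character boundary. */
    int utf8_ok = utf8.alive && utf8.pending == 0;
    int gbk_ok = gbk.alive && gbk.pending == 0;

    if (utf8_ok && gbk_ok)
        return GUESS_AMBIGUOUS;
    if (utf8_ok)
        return GUESS_UTF8;
    if (gbk_ok)
        return GUESS_GBK;
    return GUESS_UNKNOWN;
}
```

With this sketch, the bytes E4 B8 AD are accepted only by the UTF-8
automaton and D6 D0 only by the GBK one. The six bytes E4 B8 AD E6 96 87
are structurally valid in both, so pure validation leaves them
ambiguous; a real detector needs some further heuristic to pick a
winner in that case.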




First of all, I'm looking forward to an implementation that works well. But I have to say that there have been many attempts at distinguishing GBK from UTF-8, and none of them gets it exactly right. I hope this one helps. :-)

--
Regards,
Aron Xu


