
Detecting UTF-8 (Was: Re: RFS: rgbpaint)



On Wednesday, 12 January 2011 at 23:17, Muammar El Khatib wrote:
> On Wed, Jan 12, 2011 at 02:47:28AM +0100, Jakub Wilk wrote:
> > * Muammar El Khatib <muammarelkhatib@gmail.com>, 2011-01-12, 00:49:
> > >$ file debian/copyright
> > >debian/copyright: ASCII Pascal program text
> > 
> > Uhm, sorry, no, file cannot be used to determine encodings. Besides,
> > UTF-8 is a superset of ASCII, so everything is all right according
> > to file.
> 
> What would you suggest to me for determining encodings? Something like enca can
> be useful in these cases?

Lintian's approach (function file_is_encoded_in_non_utf8 in
/usr/share/lintian/lib/Util.pm) is to run

    iconv -f utf-8 -t utf-8 filename

and check whether it reports an error: a failure (non-zero exit status)
shows that the file is not valid UTF-8.
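
If it helps, here is a minimal sketch of the same idea in Python. It is
not Lintian's code: instead of calling the external converter it uses
Python's strict UTF-8 decoder, which I assume rejects essentially the
same invalid byte sequences. The function names are made up for this
illustration.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as strict UTF-8."""
    try:
        # errors="strict" raises on any malformed byte sequence.
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


def file_is_valid_utf8(path: str) -> bool:
    """Hypothetical helper: apply the check to a whole file."""
    # Read in binary mode so no implicit decoding happens first.
    with open(path, "rb") as fh:
        return is_valid_utf8(fh.read())
```

Note that a pure ASCII file passes this check as well, in line with
the remark quoted above that UTF-8 is a superset of ASCII. A passing
result only shows that the bytes are valid UTF-8, not which encoding
the author actually intended.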

Best regards,
   Mats Erik Andersson, DM

