
Re: How to guess or check encoding of text file.



Hi,

From: Osamu Aoki <osamu@debian.org>
Subject: How to guess or check encoding of text file.
Date: Sun, 5 Jan 2003 22:56:11 -0800

> I know how to convert UTF-8 and ISO-8859-1 (iconv).
> 
> Is there a good utility to guess what encoding a text file is using?  It
> does not need to be generic; handling Western languages is enough.

How about trying iconv?  If the input is not in the encoding given with
-f, iconv reports an error and exits with a non-zero status.

For example, if

    iconv -f UTF-8 -t ISO-8859-1 <somefile

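succeeds, the file is valid UTF-8 (a plain ASCII file also passes).  This
particular conversion can still fail on valid UTF-8 that contains
characters missing from ISO-8859-1, so converting UTF-8 to UTF-8 is the
safer test.  Thus, I wrote the following script: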


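#!/bin/sh
# Usage: pass the file to check as the first argument.
# iconv exits non-zero if the input is not valid UTF-8.
# Note: ">/dev/null 2>&1" is used because "&>" is a bash extension; a
# strict /bin/sh runs the command in the background and always "succeeds".
if iconv -f UTF-8 -t UTF-8 <"$1" >/dev/null 2>&1
then
  echo UTF-8
else
  # Fallback guess: any byte sequence is acceptable as ISO-8859-1.
  echo ISO-8859-1
fi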
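
Since a pure ASCII file also passes the UTF-8 test, it may be useful to
report that case separately.  Below is a minimal sketch of that idea, not
a definitive tool.  The function name guess_encoding is hypothetical, and
the sketch assumes that your iconv accepts the encoding name ASCII and
rejects bytes above 0x7F for it (glibc iconv does).  ISO-8859-1 remains
only a fallback guess, because every byte sequence converts from it
without error.

```shell
#!/bin/sh
# guess_encoding is a hypothetical helper name, not part of iconv.
# Prints one guess for the file given as $1:
#   US-ASCII, UTF-8, or ISO-8859-1 (fallback).
guess_encoding() {
  # Without this check, an unreadable file would be reported as ISO-8859-1.
  if [ ! -r "$1" ]; then
    echo "cannot read $1" >&2
    return 1
  fi
  if iconv -f ASCII -t ASCII <"$1" >/dev/null 2>&1; then
    echo US-ASCII      # only 7-bit bytes; also valid UTF-8 and ISO-8859-1
  elif iconv -f UTF-8 -t UTF-8 <"$1" >/dev/null 2>&1; then
    echo UTF-8         # well-formed UTF-8 with at least one non-ASCII byte
  else
    echo ISO-8859-1    # assumption: anything else is treated as Latin-1
  fi
}

# Report a guess for every file named on the command line.
for f in "$@"; do
  printf '%s: %s\n' "$f" "$(guess_encoding "$f")"
done
```

Saved as, say, guessenc.sh (also a made-up name), it would be run as
"sh guessenc.sh file1 file2".  The tests are ordered from the strictest
encoding to the loosest, so the first one that succeeds is the guess.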


---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/



