Re: How to guess or check encoding of text file.
Hi,
From: Osamu Aoki <osamu@debian.org>
Subject: Re: How to guess or check encoding of text file.
Date: Mon, 6 Jan 2003 00:17:10 -0800
> > #!/bin/sh
> > if iconv -f UTF-8 -t UTF-8 <"$1" >/dev/null 2>&1
> > then
> > echo UTF-8
> > else
> > echo ISO-8859-1
> > fi
>
> Bingo :) Maybe this can be wishlist for iconv.
Do you mean that this script should be included in the glibc package?
I don't think so, because this script relies on too many assumptions.
Generally, encoding guessing *must* be based on many assumptions;
otherwise the guessing is too poor to be useful.
Here, the assumption is that the input file is either UTF-8 or
ISO-8859-1. Merely adding ISO-8859-2 as a candidate makes guessing
impossible, because any byte sequence that is valid ISO-8859-1 is also
valid ISO-8859-2, so a validity check cannot tell the two apart.
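
To illustrate the point, here is a minimal sketch that lists every
candidate encoding that accepts a file, instead of picking one. The
function names valid_as and candidates are hypothetical, and the
candidate list is only an assumption for the example; it relies on
iconv exiting with a non-zero status when the input is invalid in the
source encoding.

```shell
#!/bin/sh
# valid_as ENCODING FILE
# Succeeds if FILE can be decoded as ENCODING (hypothetical helper).
valid_as() {
    iconv -f "$1" -t UTF-8 <"$2" >/dev/null 2>&1
}

# candidates FILE
# Prints each candidate encoding that accepts FILE, one per line.
# The candidate list is an assumption made for this sketch.
candidates() {
    for enc in UTF-8 ISO-8859-1 ISO-8859-2; do
        if valid_as "$enc" "$1"; then
            echo "$enc"
        fi
    done
}
```

For a file created with printf 'caf\351\n' >sample.txt, running
candidates sample.txt should list both ISO-8859-1 and ISO-8859-2, since
the byte 0xE9 is a valid character in either encoding. A validity check
alone can rule out UTF-8 here, but it cannot choose between the two
ISO-8859 variants.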
---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/