Re: Postgres - Unicode - Problem
On Wednesday 11 June 2003 15:23, Andreas Tille wrote:
> CREATE DATABASE $dbtocreate ENCODING 'unicode';
I seem to remember that pg also offers something like UTF8. The point is that
'Unicode' is in most places just a buzzword. Especially in this case, naming
the exact encoding would be much better, as Unicode can be represented in
several encodings.
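
To illustrate that last point, here is a small Python sketch (the helper name
encoded_hex is made up for this example) that prints the bytes of the same
character, the 'ö' from the INSERT, under a few encodings:

```python
# Byte representation of one character under several encodings.
CHAR = "ö"  # U+00F6


def encoded_hex(text, encoding):
    """Return the encoded bytes of text as space-separated hex."""
    return " ".join(f"{b:02x}" for b in text.encode(encoding))


for name in ("iso8859-1", "utf-8", "utf-16-be", "utf-32-be"):
    print(f"{name:10} {encoded_hex(CHAR, name)}")
```

Same character, four different byte sequences, so the server and the script
have to agree on which one is meant.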
> INSERT INTO i18n_translations(lang, orig, trans) values
> ('de_DE', 'public', 'öffentlich');
>
> ERROR: Unicode >= 0x10000 is not supported
So it looks like it can only handle code points that fit into 16 bits, i.e.
UCS2 or UTF16 without surrogate pairs. However, the question is how it
interpreted the command so that it ended up with a character with a
codepoint >= 0x10000.
Possible ways:
- UCS4: here, one char uses four bytes, but then the earlier commands should
already have failed
- UCS2/UTF16: two bytes per char (plus surrogate pairs for UTF16), else the
same as above
- UTF8: one byte per ASCII char, but multibyte chars are rather common. I'm
not sure how it could interpret this, but try saving the script as UTF8 (and
_not_ as ISO8859-1, which many editors[1] silently do).
- ASCII: using a 'signed char', they might end up with a negative codepoint
for the umlaut, resulting in an underflow and the above error.
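
For the UTF8 case, one possible route to a codepoint >= 0x10000 can be
sketched in Python. This is a guess, not a description of what pg actually
does: it assumes the script was saved as ISO8859-1 and that the bytes are then
read as UTF8 without checking the continuation bytes. In ISO8859-1 the 'ö' is
the single byte 0xF6, which has the bit pattern of a UTF8 four-byte lead byte.
The function naive_utf8_first_codepoint is hypothetical and exists only for
this illustration.

```python
# Hypothetical sketch, not PostgreSQL code: decode the first UTF-8 sequence
# of a byte string WITHOUT validating lead or continuation bytes.
def naive_utf8_first_codepoint(data):
    lead = data[0]
    if lead < 0x80:            # 0xxxxxxx: single byte
        return lead
    if lead >> 5 == 0b110:     # 110xxxxx: two-byte sequence
        length, value = 2, lead & 0x1F
    elif lead >> 4 == 0b1110:  # 1110xxxx: three-byte sequence
        length, value = 3, lead & 0x0F
    else:                      # anything else is treated as a four-byte lead
        length, value = 4, lead & 0x07
    for byte in data[1:length]:
        value = (value << 6) | (byte & 0x3F)  # no check for 10xxxxxx
    return value


text = "öffentlich"
as_latin1 = text.encode("iso8859-1")  # starts with f6 66 66 65
as_utf8 = text.encode("utf-8")        # starts with c3 b6

print(hex(naive_utf8_first_codepoint(as_utf8)))    # the real codepoint of 'ö'
print(hex(naive_utf8_first_codepoint(as_latin1)))  # far beyond 0xFFFF

# A strict decoder rejects the ISO8859-1 bytes instead of guessing.
try:
    as_latin1.decode("utf-8")
except UnicodeDecodeError as err:
    print("strict decoder:", err)
```

With the UTF8 bytes the sketch yields U+00F6 as expected. With the ISO8859-1
bytes it swallows "öffe" as one four-byte sequence and produces a value well
above 0x10000, which would fit the error message.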
As a last thing, there is the possibility (albeit small) that you cannot use
this in a script but only via some 'real' API (but I might be drifting into
obscure speculations here).
good luck
Ulrich Eckhardt
[1] apt-get install yudit
That is a rather capable editor that understands several encodings.