Multibyte encoding - what should a package provide?
I have had a request for a postgresql package with multibyte support.
Such support would enable postgresql to store data in different character
sets, so that data in Russian, Greek, Chinese or other scripts could be
stored and sorted properly.
There are several issues that I would like clarified:
1. Whether to have separate packages with and without multibyte support?
In one sense, this is only an issue because of Anglo-Saxon parochialism.
I'm inclined to say: of course I should enable multibyte support
without splitting the package, since the majority of the world's
population will need it, and even many in the English-speaking world
may sometimes need to record data in other character sets.
(Without multibyte support, it is not possible to specify a character
set for a database.) postgresql already includes locale support,
which has some detrimental effect on performance.
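
As a rough illustration of why byte-oriented handling falls short for such
data, here is a minimal Python sketch. It is not taken from PostgreSQL; the
sample words and the helper names describe and truncate_bytes are
illustrative assumptions. It compares character counts with UTF-8 byte
counts and cuts a string at a byte limit, as code unaware of multibyte
characters would.

```python
# Illustrative sample words; any text in these scripts would do.
samples = {
    "English": "data",
    "Russian": "данные",
    "Greek": "λογος",
    "Chinese": "数据",
}


def describe(text):
    """Return (character count, UTF-8 byte count) for a string."""
    return len(text), len(text.encode("utf-8"))


def truncate_bytes(text, limit):
    """Cut the UTF-8 encoding at a byte limit, ignoring character boundaries."""
    return text.encode("utf-8")[:limit]


for name, text in samples.items():
    chars, nbytes = describe(text)
    print(f"{name}: {chars} characters, {nbytes} bytes")

# Cutting the Russian sample after 3 bytes leaves half a character behind.
cut = truncate_bytes(samples["Russian"], 3)
try:
    cut.decode("utf-8")
except UnicodeDecodeError as err:
    print("byte-wise truncation broke a character:", err.reason)
```

The Russian sample is 6 characters but 12 bytes in UTF-8, and the Chinese
sample is 2 characters but 6 bytes, so any length or substring operation
that counts bytes gives wrong answers for such text.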
2. Which character set to make the default?
The choice here seems to be either Unicode (UTF-8) or SQL-ASCII. Most
other choices limit me to the character set of a few languages. Of
course, SQL-ASCII is effectively the character set for American English,
so perhaps Unicode is the only choice.
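
The trade-off between candidate defaults can be sketched in Python. This is
only an analogy under stated assumptions: Python's strict "ascii" codec
stands in for SQL-ASCII, KOI8-R and ISO 8859-7 stand in for the
language-specific choices, and the helper can_store and the sample words are
hypothetical. PostgreSQL's own behaviour may differ in detail.

```python
# Illustrative sample words in several scripts.
samples = {
    "English": "data",
    "Russian": "данные",
    "Greek": "λογος",
    "Chinese": "数据",
}

# Python codec names used as stand-ins for candidate database encodings.
candidates = ["ascii", "koi8_r", "iso8859_7", "utf-8"]


def can_store(text, encoding):
    """Report whether every character of text is representable in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


for encoding in candidates:
    storable = [name for name, text in samples.items()
                if can_store(text, encoding)]
    print(f"{encoding}: {', '.join(storable)}")

# UTF-8 encodes plain ASCII text to the same bytes as ASCII itself.
print("data".encode("ascii") == "data".encode("utf-8"))
```

In this sketch each single-language encoding handles English plus its own
script and nothing else, while UTF-8 handles all four samples and leaves
plain ASCII data byte-for-byte unchanged.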
3. Whether Debian policy has anything to say on the matter?
As more packages are internationalised, it will become more important
for us to have a consistent policy.
The above bug contains some discussion of this from the postgresql list,
and other messages may be seen in the archive of the pgsql-hackers mailing
list in the thread "Implications of multi-byte support in a distribution".
See also: /usr/doc/postgresql-doc/README.mb.gz (postgresql-doc package)
Oliver Elphick Oliver.Elphick@lfix.co.uk
Isle of Wight http://www.lfix.co.uk/oliver
PGP key from public servers; key ID 32B8FAA1