
Re: Character Encoding (UTF-8) in PERL



Oliver König wrote:
Hello list,
We have used Debian Sarge for almost 2 years on our server. It took us a lot of hard work to set up the system, the database, Perl, Apache etc. to use UTF-8 as the default character set. But in the end it worked fine on Sarge.

When Etch came out we read that everything in Etch now defaults to UTF-8 character encoding, so we decided to upgrade from Sarge to Etch. Unfortunately, after the upgrade the character encoding on our website was messed up, and it looks like the reason for that is Perl. The mess is causing tremendous damage to our website.
The operating system and Apache seem to use UTF-8, so that's good.

In MySQL everything looks fine, too:
mysql> SHOW VARIABLES LIKE "character_set_%";
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+

However, a Perl script that runs SHOW VARIABLES LIKE "character_set_%" through its DBI handle ($dbh) returns:
character_set_client      latin1
character_set_connection  latin1
character_set_database    utf8
character_set_filesystem  binary
character_set_results     latin1
character_set_server      utf8
character_set_system      utf8
character_sets_dir        /usr/share/mysql/charsets/

How can we tell Perl to use UTF-8 as the default encoding?

After connecting, run the query "SET NAMES utf8":
http://dev.mysql.com/doc/refman/5.0/en/charset-connection.html
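
A minimal sketch of that first option, assuming DBI with DBD::mysql is installed. The sub names and the environment variables DB_DSN, DB_USER and DB_PASS are made up for illustration; they are not part of DBI or of the original setup.

```perl
use strict;
use warnings;
use DBI;

# Connect, then switch this session's client, connection and results
# character sets to utf8 by issuing SET NAMES.
sub connect_with_set_names {
    my ($dsn, $user, $password) = @_;
    my $dbh = DBI->connect($dsn, $user, $password, { RaiseError => 1 });
    $dbh->do('SET NAMES utf8');
    return $dbh;
}

# Print the character_set_% variables as this connection sees them.
sub show_charset_variables {
    my ($dbh) = @_;
    my $rows = $dbh->selectall_arrayref(
        q{SHOW VARIABLES LIKE 'character_set_%'}
    );
    printf "%-26s %s\n", @$_ for @$rows;
}

# Only talk to a server when a DSN is supplied, for example
# DB_DSN='DBI:mysql:database=test;host=localhost' (hypothetical values).
if ($ENV{DB_DSN}) {
    my $dbh = connect_with_set_names($ENV{DB_DSN}, $ENV{DB_USER}, $ENV{DB_PASS});
    show_charset_variables($dbh);
    $dbh->disconnect;
}
```

If this works as expected, character_set_client, character_set_connection and character_set_results should report utf8 instead of latin1 for the script's connection.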

or (from man DBD::mysql):

mysql_enable_utf8

This attribute determines whether DBD::mysql should assume strings stored in the database are utf8. This feature defaults to off.

When set, data retrieved from a textual column type (char, varchar, etc.) will have the UTF-8 flag turned on if necessary. This enables character semantics on that string. You will also need to ensure that your database / table / column is configured to use UTF8. See Chapter 10 of the mysql manual for details.

Additionally, turning on this flag tells MySQL that incoming data should be treated as UTF-8. This will only take effect if used as part of the call to connect(). If you turn the flag on after connecting, you will need to issue the command "SET NAMES utf8" to get the same effect.

This option is experimental and may change in future versions.
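
As a rough sketch of the second option, the attribute can be passed in the call to connect(), which is the case the man page says needs no separate SET NAMES. The helper names and the environment variables DB_DSN, DB_USER and DB_PASS are hypothetical, and the binmode line is an extra assumption about the script printing UTF-8 text to STDOUT.

```perl
use strict;
use warnings;
use DBI;

# Connection attributes: mysql_enable_utf8 is the DBD::mysql attribute
# quoted above; RaiseError makes DBI die on errors.
sub utf8_connect_attrs {
    return { RaiseError => 1, mysql_enable_utf8 => 1 };
}

# Connect with the attribute set from the start, so no separate
# SET NAMES statement should be needed.
sub connect_utf8 {
    my ($dsn, $user, $password) = @_;
    return DBI->connect($dsn, $user, $password, utf8_connect_attrs());
}

# Only talk to a server when a DSN is supplied (hypothetical variable names).
if ($ENV{DB_DSN}) {
    # Encode character strings as UTF-8 when printing them.
    binmode(STDOUT, ':encoding(UTF-8)');

    my $dbh  = connect_utf8($ENV{DB_DSN}, $ENV{DB_USER}, $ENV{DB_PASS});
    my $rows = $dbh->selectall_arrayref(
        q{SHOW VARIABLES LIKE 'character_set_%'}
    );
    printf "%-26s %s\n", @$_ for @$rows;
    $dbh->disconnect;
}
```

Since the man page marks the option as experimental, it seems worth checking the printed variables after each DBD::mysql upgrade.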

I also mailed the package maintainer bob@debian.org, but the mail could not be delivered ("unknown user").




--
-------e-l-o-y----------------------------e-l-o-y-@-k-o-f-e-i-n-a-.-n-e-t------

       how good that there are oceans - without them it would be even sadder


