Friday, February 20, 2009

MySQL UTF-8 encoding, and how to save Chinese/Japanese characters

One of the requirements of the application I'm currently developing is to be able to save Unicode characters. But when we saved the data to the db, the Unicode characters were converted to a sequence of question marks (like these: ??????), and that was really annoying.

So we did some googling and found the answer here:

http://dev.mysql.com/doc/refman/5.0/en/charset-connection.html

To make a long story short, if you want to store Unicode characters, make sure that my.ini (my.cnf on Linux) has the following lines in it:

collation_server=utf8_unicode_ci
character_set_server=utf8
character_set_client=utf8

I don't know why, but it seems that the MySQL Connector for .NET was sending the query statements in a non-Unicode encoding, so the data reached the db with the wrong encoding and question marks were stored instead of the Unicode characters.
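To see where the question marks come from, here is a small Python sketch. It is only an illustration, not MySQL or Connector code, and it assumes the connection was using a single-byte character set such as latin1, which I have not verified for our setup. The sample string is made up.

```python
# Illustration only: mimics converting text to a character set that
# cannot represent it. The sample string is a made-up example.
text = "日本語 中文"

# latin-1 has no CJK characters, so every one of them becomes "?"
lossy = text.encode("latin-1", errors="replace").decode("latin-1")

# UTF-8 can represent every Unicode character, so nothing is lost
safe = text.encode("utf-8").decode("utf-8")

print(lossy)
print(safe)
```

Once a character has been replaced by a question mark it cannot be recovered, which is why the encoding has to be right before the data is written.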

If you don't have control of the MySQL server, you could instead run the equivalent SQL statement on the connection before your queries, something like this:

SET NAMES utf8
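As a rough sketch of the idea in Python (not the .NET code we actually use), the statement can be sent right after opening the connection and before any other query. The helper name `use_utf8` is hypothetical, and `cursor` is assumed to be any DB-API style cursor with an `execute` method.

```python
def use_utf8(cursor):
    """Tell the server that this connection sends and expects UTF-8.

    `cursor` is assumed to be a DB-API style cursor; the helper name
    is hypothetical and only serves this example.
    """
    # SET NAMES sets character_set_client, character_set_connection
    # and character_set_results for the current connection only.
    # Note that the MySQL character set name is utf8, not UTF-8.
    cursor.execute("SET NAMES utf8")
```

The setting lasts only for that connection, so it has to be repeated every time a new connection is opened.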

I don't know how to do this when you use NHibernate (because the SQL statements are sent by the NHibernate layer), but I read some posts saying that it works.

Hopefully this helps someone else.

Important note: the three configuration lines above should be added after the default-character-set line in the configuration file to work properly.