WordPress and MySQL character encoding

Recently I moved my WordPress blog to a new server. I took a dump of the old database and imported it into the new MySQL server. Everything was fine except that I started getting some strange characters in my posts. For instance, I was getting (Nov — 3rd — 2007) instead of (Nov – 3rd – 2007).

It took me a lot of research and googling to find the solution, so I thought I would share it here for others who might face the same problem.

First I raised a support request with WordPress, but didn’t get a reply. After some googling, I found that the cause was the wrong character set on my new MySQL server. Instead of using utf8, the MySQL database server was running with its default latin1 character set.
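To see why a character set mismatch produces junk characters, here is a minimal Python sketch. It assumes the text was stored as UTF-8 bytes and then read back as latin1 (MySQL's latin1 is essentially Windows-1252, so the sketch uses Python's cp1252 codec). The variable names are mine, and the exact junk you see on a real blog depends on the text and on which layers disagree about the encoding.

```python
# A date string with en dashes (U+2013), like the one in my post.
original = "Nov \u2013 3rd \u2013 2007"

# The bytes that a UTF-8 aware application would store.
utf8_bytes = original.encode("utf-8")

# Reading those same bytes as latin1/cp1252 turns each en dash
# into three unrelated characters.
garbled = utf8_bytes.decode("cp1252")

# Reversing the wrong step recovers the text, provided nothing was lost.
repaired = garbled.encode("cp1252").decode("utf-8")

# ascii() prints escapes, so the output does not depend on the console encoding.
print(ascii(original))
print(ascii(garbled))
print(repaired == original)
```

Each en dash is three bytes in UTF-8, which is why one character turns into three when the bytes are interpreted one at a time.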

I changed the character set in the MySQL configuration file (my.cnf) and in the wp-config.php file of WordPress, and re-imported the tables. Even this didn’t solve my problem. Later I found that certain characters in my wp_posts table were still encoded in latin1, even though the table was set to the utf8 character set.

I then exported the table using a tool called HeidiSQL (which is, by the way, an excellent alternative to the command-line MySQL client). I opened the SQL file in a text editor and changed all instances of latin1 to utf8 (basically a find/replace). I saved the file, imported the tables again, and the junk characters were gone. :)
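If you would rather script the find/replace than do it in a text editor, something like the following Python sketch should work. The function name and the sample dump are made up for illustration, and mapping latin1_swedish_ci to utf8_general_ci is my assumption about what you want for collations; adjust it to your own dump.

```python
import re

def retarget_dump_charset(sql_text: str) -> str:
    """Rewrite latin1 charset declarations in a SQL dump to utf8."""
    # Collation names embed the charset name, so map the default
    # latin1 collation explicitly (assumed target: utf8_general_ci).
    sql_text = sql_text.replace("latin1_swedish_ci", "utf8_general_ci")
    # Replace standalone 'latin1' tokens such as CHARSET=latin1
    # or SET NAMES latin1.
    return re.sub(r"\blatin1\b", "utf8", sql_text)

# A tiny made-up dump fragment, just to show the effect.
sample = (
    "/*!40101 SET NAMES latin1 */;\n"
    "CREATE TABLE `wp_posts` (\n"
    "  `post_title` text COLLATE latin1_swedish_ci\n"
    ") ENGINE=MyISAM DEFAULT CHARSET=latin1;\n"
)

converted = retarget_dump_charset(sample)
print(converted)
```

Like the manual find/replace, this also changes the word latin1 wherever it appears in post content (this post would be affected, for example), so keep a copy of the original dump and check the result before importing it.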

So, the lesson learned the hard way: KEEP EVERYTHING IN UTF-8, ABSOLUTELY EVERYWHERE, FROM DAY ONE. You’ll be glad you did some day.
