from mysql jdbc 中文乱码解决方案
对于mysql数据库而言,解决中文乱码,可以从两个方向考虑,一个是通过修改mysql服务器端的配置文件/etc/mysql/my.cnf来支持中文,比如:
...
[mysql]
default-character-set=utf8
...
但是,修改配置文件需要重启服务,尤其是对于已经在线上运行的数据库或者“老”的数据库实例(有可能是多实例或集群)而言,显然通过修改配置文件来操作是不适合的,甚至是不被DBA允许的,通常能想到的办法是在客户端或者JDBC连接时定制支持中文的编码格式(通常使用UTF-8),这样插入数据的时候,让mysql为自动为我们转码,可行的办法有两种:
1、如果是通过DriverManager.getConnection(url)编码方式操作JDBC,可以在JDBC的url中追加useUnicode=true&characterEncoding=UTF-8解决乱码问题。
jdbc.url=jdbc:mysql://127.0.0.1:3306/mydb?useUnicode=true&characterEncoding=UTF-8
2、如果是通过其它数据源,比如DBCP、tomcat-jdbc、c3p0、spring-jdbc、hibernate读取配置文件,在url中追加useUnicode=true&characterEncoding=UTF-8是不起作用的,而是通过数据源自身的配置生效,比如下列配置:
<!-- Tomcat data source -->
<bean id="dataSource" class="org.apache.tomcat.jdbc.pool.DataSource">
<property name="driverClassName" value="${jdbc.driverClassName}" />
<property name="url" value="${jdbc.url}" />
<property name="username" value="${jdbc.username}" />
<property name="password" value="${jdbc.password}" />
<property name="dbProperties">
<props>
<prop key="useUnicode">yes</prop>
<prop key="characterEncoding">utf8</prop>
</props>
</property>
<!-- Configuration refer to optimizing connection performance -->
<property name="initialSize" value="10" />
<property name="maxActive" value="100" />
<property name="maxIdle" value="50" />
<property name="minIdle" value="10" />
<property name="suspectTimeout" value="60" />
<property name="timeBetweenEvictionRunsMillis" value="30000" />
<property name="minEvictableIdleTimeMillis" value="60000" />
<property name="testOnBorrow" value="true" />
<property name="validationQuery" value="SELECT 1" />
<property name="validationInterval" value="30000" />
<!-- End Configuration refer to optimizing connection performance -->
</bean>
其中:
<props>
<prop key="useUnicode">yes</prop>
<prop key="characterEncoding">utf8</prop>
</props>
等价于url中的useUnicode=true&characterEncoding=UTF-8
查看当前数据库的字符集(切换到某个数据库前后比较):
mysql> show variables like 'char_%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | latin1 |
| character_set_connection | latin1 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | latin1 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)
mysql> use robot_classifymark;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> show variables like 'char_%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | latin1 |
| character_set_connection | latin1 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | latin1 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)
可以看到数据库robot_classifymark采用的是utf8编码的,正好对应创建数据库时指定的编码,对应建库SQL:
CREATE DATABASE IF NOT EXISTS robot_classifymark DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
如果想在命令行下插入中文,可以执行set names utf8,此命令会影响三个参数:
mysql> show variables like 'char_%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | latin1 |
| character_set_connection | latin1 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | latin1 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)
mysql> set names utf8
-> ;
Query OK, 0 rows affected (0.00 sec)
mysql> show variables like 'char_%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)
可见:执行命令set names utf8,影响了三个参数:character_set_client,character_set_connection,character_set_results,如此,可以在当前命令行下插入中文了,但是,此命令只对当前命令行有效(*),如果想每次登入mysql命令行都生效,同样需要修改mysql服务器端的配置文件/etc/mysql/my.cnf
...
[client]
default-character-set=utf8
...
所以,如果JDBC连接时,如果url未指定useUnicode=true&characterEncoding=UTF-8,可以变相的通过每次执行insert或select语句之前先执行set names utf8来插入或查询中文,显然,此方式没有在url中指定useUnicode=true&characterEncoding=UTF-8或者通过上面的数据源配置形式简洁!