Recently I worked on a project where one of the requirements was to support all languages (Arabic, Chinese, German, Hindi, etc.). Doing so was not as straightforward as I had thought, especially when multilingual data passes through different layers and frameworks in your application.
Basically, to meet this requirement you have to use a character encoding that can represent every possible character, or code point, in Unicode.
UTF-8 is the dominant character encoding for the World Wide Web, accounting for 85.1% of all Web pages.
After doing some research I was able to configure UTF-8 character encoding all the way from the front end to the database and back. So here is a summary of what I have done:
Even though configurations may be different from one application to another, the concept and the main components of each web application should be relatively similar. So I will try to explain it in a generic way and use the specific technology I used in my project as an example.
The following are the areas you should look at in your application:
Database configuration: If your application stores data in many languages, you have to make sure the database and its tables are created with UTF-8 encoding options. Many databases do not use UTF-8 by default, so you have to specify it explicitly in your queries. In my case I have a MySQL database:
-- MySQL's "utf8" stores at most 3 bytes per character; utf8mb4 (MySQL 5.5.3+) covers all of Unicode.
-- To create a database with full UTF-8 support:
CREATE DATABASE your_db_name CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
-- To create a table with full UTF-8 support:
CREATE TABLE my_table (
    user_id INT(11) NOT NULL AUTO_INCREMENT,
    first_name VARCHAR(45),
    PRIMARY KEY (user_id))
ENGINE=INNODB
DEFAULT CHARACTER SET = utf8mb4
COLLATE = utf8mb4_bin;
-- If the database already exists:
ALTER DATABASE your_db_name CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
-- If the table already exists:
ALTER TABLE your_table_name CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
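To see why the choice of MySQL character set matters, it helps to look at how many bytes UTF-8 needs per character. The sketch below is only an illustration: the class and method names are made up for this post, and the sample strings are written as Unicode escapes so the file compiles regardless of its own encoding. Characters outside the Basic Multilingual Plane (many emoji, some rare CJK characters) need four bytes, which MySQL's three-byte `utf8` character set cannot store, while `utf8mb4` can.

```java
import java.nio.charset.StandardCharsets;

class Utf8ByteLengths {
    // Number of bytes UTF-8 needs to encode the whole string.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // True if any code point lies outside the Basic Multilingual Plane,
    // which means it takes four bytes in UTF-8.
    static boolean needsFourByteUtf8(String text) {
        return text.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }

    public static void main(String[] args) {
        String[][] samples = {
            {"German", "Gru\u00df"},                          // 3 ASCII letters + sharp s
            {"Arabic", "\u0645\u0631\u062d\u0628\u0627"},     // 5 letters, 2 bytes each
            {"Chinese", "\u4f60\u597d"},                      // 2 characters, 3 bytes each
            {"Emoji", "\uD83D\uDE00"},                        // U+1F600, one 4-byte code point
        };
        for (String[] s : samples) {
            System.out.printf("%-8s %2d bytes, needs 4-byte UTF-8: %b%n",
                    s[0], utf8Length(s[1]), needsFourByteUtf8(s[1]));
        }
    }
}
```

Running it should report 5, 10, 6 and 4 bytes respectively, with only the emoji flagged as needing four-byte UTF-8.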
Application datasource connection: after configuring your database to support UTF-8 encoding, you have to make sure that the datasource connection in your application is established with the UTF-8 option. If you skip this step, non-Latin data can be stored as “?????????”. In MySQL all you need to do is add the encoding options at the end of your JDBC URL:
${db.config.url}?useUnicode=true&characterEncoding=UTF8
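The question marks are not specific to MySQL; they are what Java produces whenever text is encoded with a charset that cannot represent it. The following is a minimal sketch of that mechanism, assuming the connection falls back to a single-byte charset such as ISO-8859-1. It does not talk to a database, and the class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class LossyEncodingDemo {
    // Encode and decode with the same charset, mimicking a round trip
    // through a connection that is configured with that charset.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String arabic = "\u0645\u0631\u062d\u0628\u0627"; // Arabic "marhaban", 5 letters

        // ISO-8859-1 has no Arabic letters, so each one is replaced by '?'.
        System.out.println(roundTrip(arabic, StandardCharsets.ISO_8859_1)); // prints ?????

        // UTF-8 can represent every code point, so nothing is lost.
        System.out.println(roundTrip(arabic, StandardCharsets.UTF_8).equals(arabic)); // prints true
    }
}
```

Once the characters have been replaced this way the original text cannot be recovered, which is why the connection has to be right before any data is written.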
Web server settings: if you are using Apache Tomcat, update the Connector tag in the server.xml file to support UTF-8:
<Connector port="8080" protocol="HTTP/1.1" connectionTimeout="20000" redirectPort="8443" URIEncoding="UTF-8" />
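URIEncoding controls how the container decodes the percent-encoded bytes in the URL, such as query string parameters. Here is a rough sketch of what goes wrong when the browser sends UTF-8 but the server decodes as ISO-8859-1, which older Tomcat versions did by default. It uses only the JDK (Java 10 or later for the Charset overloads), and the class and method names are invented for the example.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class UriDecodingDemo {
    // What a browser does with a non-ASCII query parameter on a UTF-8 page.
    static String encodeUtf8(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // What the container does with the percent-encoded bytes it receives.
    static String decode(String encoded, Charset charset) {
        return URLDecoder.decode(encoded, charset);
    }

    public static void main(String[] args) {
        String original = "\u4f60\u597d"; // Chinese "ni hao"
        String encoded = encodeUtf8(original);
        System.out.println(encoded); // prints %E4%BD%A0%E5%A5%BD

        // Decoding with the matching charset restores the two characters.
        System.out.println(decode(encoded, StandardCharsets.UTF_8).equals(original)); // prints true

        // Decoding as ISO-8859-1 turns each of the six bytes into its own character.
        System.out.println(decode(encoded, StandardCharsets.ISO_8859_1).length()); // prints 6
    }
}
```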
Maven project build: add the following Maven property to your pom.xml. It tells Maven to process your project source files (such as properties, messages, …) using UTF-8 encoding:
<properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
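The build setting only covers how Maven reads the files. If your own code loads a UTF-8 properties file at runtime, it has to read it as UTF-8 too, because `Properties.load(InputStream)` assumes ISO-8859-1. The sketch below shows the difference; the `greeting` key and the class name are made up for illustration, and the file content is simulated with an in-memory byte array.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

class PropertiesEncodingDemo {
    // Loads the hypothetical "greeting" key from the raw bytes of a properties file.
    static String loadGreeting(byte[] fileBytes, boolean asUtf8) {
        Properties props = new Properties();
        try {
            if (asUtf8) {
                // A Reader lets us choose the charset explicitly.
                try (Reader reader = new InputStreamReader(
                        new ByteArrayInputStream(fileBytes), StandardCharsets.UTF_8)) {
                    props.load(reader);
                }
            } else {
                // The InputStream overload always decodes as ISO-8859-1.
                props.load(new ByteArrayInputStream(fileBytes));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("greeting");
    }

    public static void main(String[] args) {
        // Simulated file content: greeting=<Chinese "ni hao">, saved as UTF-8.
        byte[] file = "greeting=\u4f60\u597d".getBytes(StandardCharsets.UTF_8);
        System.out.println(loadGreeting(file, true).length());  // prints 2
        System.out.println(loadGreeting(file, false).length()); // prints 6
    }
}
```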
JSP page encoding: in order for the JSP to process all possible characters, you have to do one of the following:
Add the pageEncoding attribute to the page directive in every JSP:
<%@ page pageEncoding="utf-8" %>
Configure your web.xml so that the page encoding is applied to every JSP by default:
<!-- Applies to all JSPs, so the pageEncoding directive is not needed in each file -->
<jsp-config>
    <jsp-property-group>
        <url-pattern>*.jsp</url-pattern>
        <page-encoding>UTF-8</page-encoding>
    </jsp-property-group>
</jsp-config>
Spring encoding filter: add the Spring character encoding filter to web.xml. Make sure this filter is declared before any other filters:
<filter>
    <!-- This filter has to come before other filters. -->
    <filter-name>characterEncodingFilter</filter-name>
    <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
    <init-param>
        <param-name>encoding</param-name>
        <param-value>UTF-8</param-value>
    </init-param>
    <init-param>
        <param-name>forceEncoding</param-name>
        <param-value>true</param-value>
    </init-param>
</filter>
<filter-mapping>
    <filter-name>characterEncodingFilter</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>
Spring message converter: this is especially important if you are using POST parameters and @RequestBody in your Spring controllers.
<!-- Enable @Controller annotation support -->
<mvc:annotation-driven>
    <mvc:message-converters register-defaults="true">
        <bean class="org.springframework.http.converter.StringHttpMessageConverter">
            <property name="supportedMediaTypes" value="text/plain;charset=UTF-8" />
        </bean>
    </mvc:message-converters>
</mvc:annotation-driven>
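The reason this setting matters is that Spring's StringHttpMessageConverter defaults to ISO-8859-1 when the media type carries no charset. The snippet below is a simplified sketch of that idea, not Spring's actual code: a hypothetical helper picks the charset from a Content-Type value and falls back to ISO-8859-1 when none is given.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ContentTypeCharset {
    // Fallback used when the Content-Type has no charset parameter
    // (ISO-8859-1 is the documented default of StringHttpMessageConverter).
    static final Charset FALLBACK = StandardCharsets.ISO_8859_1;

    // Extracts the charset parameter from a value such as "text/plain;charset=UTF-8".
    static Charset charsetOf(String contentType) {
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                String name = p.substring(8).trim().replace("\"", "");
                return Charset.forName(name);
            }
        }
        return FALLBACK;
    }

    // Encodes a response body the way a converter would for that Content-Type.
    static byte[] encodeBody(String body, String contentType) {
        return body.getBytes(charsetOf(contentType));
    }

    public static void main(String[] args) {
        String body = "\u4f60\u597d"; // Chinese "ni hao"
        // With an explicit UTF-8 charset both characters survive (3 bytes each).
        System.out.println(encodeBody(body, "text/plain;charset=UTF-8").length); // prints 6
        // Without it the fallback replaces each character with a single '?'.
        System.out.println(new String(encodeBody(body, "text/plain"), FALLBACK)); // prints ??
    }
}
```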