Bence Legradi

How to test Tomcat’s character encoding configuration

Hi All,

I’ve seen many cases where non-Latin characters are not displayed correctly in Web Intelligence or other reporting tools.

It’s not always easy to identify where the issue is coming from, so I’d like to help you with a simple testing method.

The prerequisites for this test are that your database is configured for UTF-8 and that you can display non-Latin characters using other tools.

  1. Please make sure that URIEncoding is set to UTF-8 on the Connector element
    in Tomcat’s server.xml file.

    You can validate this by following the steps written in KBA 1497582.

  2. Open Notepad++ and paste in the following code,
    which echoes back the content of a form submitted with the GET method:

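    <%@ page contentType="text/html; charset=UTF-8"%>
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
    <html>
    <head>
    <title>Character encoding test page</title>
    </head>
    <body>
    <p>Data posted to this form was:
    <%
        // By default this only affects the request body (POST data).
        // GET query strings are decoded using the connector's URIEncoding.
        request.setCharacterEncoding("UTF-8");
        // Test page only: the value is echoed back without HTML escaping,
        // and "null" is printed until the form has been submitted.
        out.print(request.getParameter("mydata"));
    %>
    </p>
    <form method="GET" action="test1.jsp">
    <input type="text" name="mydata">
    <input type="submit" value="Submit"/>
    <input type="reset" value="Reset"/>
    </form>
    </body>
    </html>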

  3. Save the file as “test1.jsp”

    /wp-content/uploads/2015/06/1_736270.png

  4. Navigate to Tomcat’s ROOT directory (the ROOT folder under webapps)

    /wp-content/uploads/2015/06/2_736277.png

  5. Copy “test1.jsp” into this folder
  6. Open “test1.jsp” using IE. In my example the URL is: http://localhost:8080/test1.jsp

    /wp-content/uploads/2015/06/3_736278.png

  7. Type in some non-Latin characters, e.g. Chinese characters (形声字 / 形聲字)

    /wp-content/uploads/2015/06/4_736279.png

  8. After clicking the Submit button, you should see the text you entered echoed back unchanged.

    /wp-content/uploads/2015/06/5_736295.png
    Note: In the current example Tomcat’s UTF-8 configuration seems to be correct.

  9. If you remove the URIEncoding="UTF-8" parameter from server.xml and restart Tomcat, you should get something like this:

    /wp-content/uploads/2015/06/6_736296.png

  10. Another possible output of a wrong configuration:

    /wp-content/uploads/2015/06/7_736297.png
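
To illustrate why the output in step 9 becomes garbled, here is a minimal Java sketch. It assumes the browser sends the form value as percent-encoded UTF-8 bytes and that, without URIEncoding, the connector falls back to ISO-8859-1 (the documented default in older Tomcat versions). The class and method names are made up for this illustration and are not part of Tomcat, and the exact garbage shown in your browser may differ from what this produces.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: simulates the GET round trip without a server.
class UriDecodingDemo {

    // Percent-encode a form value the way a browser does on a UTF-8 page.
    static String encodeAsBrowser(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Decode the query-string value with the charset the server assumes.
    static String decodeAsServer(String encoded, Charset charset) {
        try {
            return URLDecoder.decode(encoded, charset.name());
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // charset object already exists
        }
    }

    public static void main(String[] args) {
        // The first sample text from step 7, written as Unicode escapes.
        String typed = "\u5f62\u58f0\u5b57";

        String onTheWire = encodeAsBrowser(typed);
        System.out.println("Query value: " + onTheWire); // %E5%BD%A2%E5%A3%B0%E5%AD%97

        // Matching charset: the original three characters come back.
        String good = decodeAsServer(onTheWire, StandardCharsets.UTF_8);
        System.out.println("UTF-8 round trip intact: " + good.equals(typed));

        // Assumed fallback charset: every byte becomes its own character.
        String bad = decodeAsServer(onTheWire, StandardCharsets.ISO_8859_1);
        System.out.println("ISO-8859-1 result length: " + bad.length());
    }
}
```

Each Chinese character is three bytes in UTF-8, so a single-byte decoding turns the three characters into nine unrelated ones. That is the kind of mismatch the test page makes visible.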

Please note that this test is not a fail-safe way to prove that the configuration is correct or incorrect, but it can help you narrow down the issue.
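
As a complement to the browser test, the URIEncoding check from step 1 could also be scripted. The sketch below is only one possible approach and is not the procedure described in the KBA. It parses a server.xml file with the standard Java XML parser and lists the URIEncoding attribute of every Connector element. The class name, method names and output format are my own assumptions.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper class: reports URIEncoding for each Connector.
class ConnectorEncodingCheck {

    // Returns one line per <Connector>, showing its port and URIEncoding.
    static List<String> describeConnectors(InputStream serverXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(serverXml);
            NodeList connectors = doc.getElementsByTagName("Connector");
            List<String> lines = new ArrayList<String>();
            for (int i = 0; i < connectors.getLength(); i++) {
                Element connector = (Element) connectors.item(i);
                String encoding = connector.hasAttribute("URIEncoding")
                        ? connector.getAttribute("URIEncoding")
                        : "(not set)";
                lines.add("port=" + connector.getAttribute("port")
                        + " URIEncoding=" + encoding);
            }
            return lines;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse server.xml", e);
        }
    }

    // Convenience overload for XML content held in a string.
    static List<String> describeConnectors(String serverXml) {
        byte[] bytes = serverXml.getBytes(StandardCharsets.UTF_8);
        return describeConnectors(new ByteArrayInputStream(bytes));
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("Usage: java ConnectorEncodingCheck <path to server.xml>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            for (String line : describeConnectors(in)) {
                System.out.println(line);
            }
        }
    }
}
```

Run it with the path to your server.xml as the only argument. A connector reported as "(not set)" uses whatever default your Tomcat version applies, so it is a good candidate to check with the test page.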

I hope it will be useful for you.

Regards,

Bence
