Wrong charset in a Tomcat servlet

candamil · Tux's lil' helper Joined: 19 Mar 2012 Posts: 96

Hi guys, I hope you can help me with this problem. First of all, the servlet itself works fine: I have been using it for several months on several systems. It fails now on a new Gentoo installation, so the problem must be in the installation.

OK, let me explain. The servlet lists the items within a category. The items are entered from a web browser and stored in a file whose name is the hash of the category name, computed with the Java function hashCode(). The servlet works in UTF-8, and the files are also created in UTF-8. I had some files created on the other systems, but when I tried to open one whose category name has an accent (Películas), the servlet wasn't able to find the file. I then tried to create a new one and discovered that accented characters are not handled properly (so the charset is wrong).
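To make the naming scheme concrete, here is a minimal sketch of how a file name could be derived from the category. The class and the helper fileNameFor are hypothetical names used only for illustration, not the servlet's real code. The accented letter is written as a Unicode escape so the result does not depend on the encoding of the source file.

```java
public class CategoryFileName {

    // Hypothetical helper: the file name is the decimal form of String.hashCode().
    static String fileNameFor(String category) {
        return Integer.toString(category.hashCode());
    }

    public static void main(String[] args) {
        // "Películas", with the accented i escaped as \u00ed
        String category = "Pel\u00edculas";

        // Prints 1014027990, the expected file name mentioned in this post.
        System.out.println(fileNameFor(category));
    }
}
```

Since hashCode() works on the characters of the String, any change in how the category text is decoded changes the file name as well.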

I opened the new file with KWrite. It should contain this: Películas. If I set the encoding to UTF-8, I get this: PelÃculas, but if I set the encoding to ISO, I get this: PelÃÂculas. I have tried several encodings, but it's wrong with all of them. It seems the word is written with one encoding and stored with another, so the word ends up wrong. And because the word is wrong, the hash and therefore the name of the file are wrong too (556892427 instead of 1014027990).
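The symptoms can be reproduced outside the servlet. The sketch below is only an illustration, assuming the UTF-8 bytes of the category are decoded as ISO-8859-1 somewhere before the hash is computed; the class and the helper misdecode are hypothetical names. Under that assumption the hash comes out as 556892427, the same wrong value as above, and writing the result back out as UTF-8 gives a double-encoded word like the one seen in KWrite.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    // Hypothetical helper: encode as UTF-8, then wrongly decode the bytes as ISO-8859-1.
    static String misdecode(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "Pel\u00edculas";   // "Películas"
        String once = misdecode(original);    // the accented i becomes two characters
        String twice = misdecode(once);       // what a double-encoded file looks like when viewed as ISO

        System.out.println(original.hashCode()); // 1014027990
        System.out.println(once.hashCode());     // 556892427
        System.out.println(original.length() + " " + once.length() + " " + twice.length()); // 9 10 12
    }
}
```

If this matches what happens on the new installation, the place to look is wherever the incoming text is decoded, rather than the code that writes the file.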

The system is configured in UTF-8:

86me · n00b Joined: 20 Jul 2009 Posts: 20

The XML links you listed are not publicly viewable.

candamil · Tux's lil' helper Joined: 19 Mar 2012 Posts: 96

Whoops, you are totally right. My fault. I hope it works this time:

http://www.filedropper.com/correct
http://www.filedropper.com/incorrect