I just ran into a strange encoding problem in a Java web project.
System.out.println("search url: " + searchURL);
searchURL = new String(searchURL.getBytes("utf-8"), "utf-8");
System.out.println("test===" + new String(searchURL.getBytes("utf-8")));
I tested the code above in a Java main function, and it works fine with Chinese characters.
output:
search url: https://api.datamarket.azure.com/Data.ashx/Bing/Search/Image?Query=%27机器 猫%27&$format=json&$skip=0
test===https://api.datamarket.azure.com/Data.ashx/Bing/Search/Image?Query=%27机器 猫%27&$format=json&$skip=0
But when the same code runs in Tomcat, the second line comes out garbled.
output:
search url: https://api.datamarket.azure.com/Data.ashx/Bing/Search/Image?Query=%27机器 猫%27&$format=json&$skip=0
test===https://api.datamarket.azure.com/Data.ashx/Bing/Search/Image?Query=%27鏈哄櫒 鐚?27&$format=json&$skip=0
Then I tested this in Tomcat:
searchURL = new String(searchURL.getBytes("utf-8"), "utf-8");
System.out.println(new String(searchURL.getBytes("gbk")));
System.out.println(new String(searchURL.getBytes("gb2312")));
Both of the above print correctly. Why?
Any suggestions would be appreciated, thanks!
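The default charset differs between your standalone JVM and the Tomcat JVM.
Try passing the charset explicitly when decoding: new String(searchURL.getBytes("utf-8"), "utf-8")
The one-argument form, new String(searchURL.getBytes("utf-8")), decodes the bytes with the platform default charset, which may or may not be UTF-8.
So while the byte array is UTF-8, the decoder may expect something else. In your Tomcat output the UTF-8 bytes appear to have been decoded as GBK, which would also explain why the gbk and gb2312 tests print correctly.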
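To make the mismatch concrete, here is a minimal sketch of encoding with one charset and decoding with another. The class name CharsetDemo and the helper roundTrip are hypothetical, made up for this illustration. ISO-8859-1 is used only as a stand-in for "some charset other than UTF-8", because it is guaranteed to exist on every JVM; the Tomcat JVM in the question is probably defaulting to GBK instead. The query text is written with Unicode escapes so the example does not depend on the source file's encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetDemo {

    // Hypothetical helper: encode text with one charset, decode the bytes with another.
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        // The three Chinese characters and the space from the query in the question.
        String query = "\u673a\u5668 \u732b";

        // Shows which charset new String(byte[]) and getBytes() fall back to on this JVM.
        System.out.println("default charset: " + Charset.defaultCharset());

        // Matching charsets: the text survives the round trip.
        String good = roundTrip(query, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println("utf-8 -> utf-8 intact: " + good.equals(query));

        // Mismatched charsets: each UTF-8 byte is misread, so the text is garbled.
        String bad = roundTrip(query, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println("utf-8 -> iso-8859-1 intact: " + bad.equals(query));

        // Equivalent to new String(query.getBytes("utf-8")): the result depends on the JVM default.
        String dflt = roundTrip(query, StandardCharsets.UTF_8, Charset.defaultCharset());
        System.out.println("utf-8 -> default intact: " + dflt.equals(query));
    }
}
```

Running this both standalone and inside Tomcat should show whether the two JVMs report different default charsets. If they do, naming the charset on both the encode and decode side, as in the first call, keeps the result independent of the environment. Starting the Tomcat JVM with -Dfile.encoding=UTF-8 is another common way to change the default, though it affects the whole process.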