I’m using Rails 3.0.3 with REE (Ruby 1.8.7) and gem ‘mysql2’, ‘0.2.6’.
There’s a search feature in my project that lets people search with a GET request, either by typing the URL directly or by submitting a form that generates the URL.
Example:
I want to search for:
origin city: “Århus, Denmark” and destination city: “Asunción, Paraguay”
Both contain a special character (“Å” and “ó”), so this is the URL that gets generated when someone clicks the search button:
?&origin=%C5rhus%2C%20Denmark&destination=Asunci%F3n%2C%20Paraguay
Problem:
When I search for those cities, the parameters are not unescaped the way I want (I tried CGI, URI, and even some gems).
When I look at the console, ActiveRecord received the query like this:
Parameters: {"destination"=>"Asunci�n, Paraguay", "origin"=>"�rhus, Denmark", "sort"=>"newest"}
City Load (0.1ms) SELECT `cities`.* FROM `cities` WHERE (`cities`.`name` = '�rhus') ORDER BY cities.name ASC
City Load (6.8ms) SELECT `cities`.* FROM `cities` WHERE (`cities`.`name` = 'Asunci�n, Paraguay') ORDER BY cities.name ASC
Conclusion: the cities can’t be found 🙁
But I found an interesting thing:
When I introduce an error in the file associated with this function, the output looks like this:
Request
Parameters: {"destination"=>"Asunción, Paraguay", "origin"=>"Århus, Denmark", "sort"=>"newest"}
It’s a valid one!
Question:
Does anyone have an idea how to solve this? Thanks in advance 🙂
You’re right, it looks like you have an encoding problem somewhere. The 0xC5 character is “Å” in ISO-8859-1 (AKA Latin-1); in UTF-8 it would be `%C3%85` in the URL.

I suspect that you’re using JavaScript on the client side and that your JavaScript is using the old `escape` function to build the URL; `escape` has some issues with non-ASCII characters. If this is the case, then you should upgrade your JavaScript to use `encodeURIComponent` instead. Have a look at this little demo and you’ll see what I’m talking about.

If you can’t change the client-side script then you can do it the hard way in Ruby using `force_encoding` and `encoding`. You should get something like `"\xC5rhus, Denmark"` from `params`, and you could unmangle that by treating the bytes as ISO-8859-1 and converting them to UTF-8.

Dealing with this on the server side would be a last resort though; if your client-side code is sending back incorrectly encoded data then you’ll be left with a pile of guesswork on the server to figure out what encoding was actually used to get it into the URL.
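Here is a rough sketch of both points, assuming Ruby 1.9 or later, where strings carry an encoding (`force_encoding` and `encode` do not exist on 1.8.7). The helper names `percent_encode` and `latin1_to_utf8` are made up for illustration and are not part of Rails or any gem. The first helper is only a simplified stand-in for what `escape` and `encodeURIComponent` produce for this particular input; the second reinterprets bytes that are not valid UTF-8 as ISO-8859-1 and transcodes them.

```ruby
# Hypothetical helper: percent-encode every byte outside the URI "unreserved"
# set after converting the text to the given encoding.
UNRESERVED = /\A[A-Za-z0-9\-._~]\z/

def percent_encode(text, encoding)
  text.encode(encoding).bytes.map do |byte|
    char = byte < 0x80 ? byte.chr : nil
    char && char.match?(UNRESERVED) ? char : format("%%%02X", byte)
  end.join
end

# Hypothetical helper: if the bytes are not valid UTF-8, assume they are
# ISO-8859-1 (an assumption, the server cannot know this for sure) and
# transcode them to UTF-8.
def latin1_to_utf8(raw)
  s = raw.dup.force_encoding(Encoding::UTF_8)
  return s if s.valid_encoding?
  s.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end

name = "\u00C5rhus, Denmark" # "Århus, Denmark"

# Same text, two different URLs depending on the byte encoding used.
latin1_url = percent_encode(name, "ISO-8859-1")
utf8_url   = percent_encode(name, "UTF-8")
puts latin1_url # => %C5rhus%2C%20Denmark    (what the question's URL contains)
puts utf8_url   # => %C3%85rhus%2C%20Denmark (what Rails expects)

# Repairing the mangled values on the server side.
origin      = latin1_to_utf8("\xC5rhus, Denmark")
destination = latin1_to_utf8("Asunci\xF3n, Paraguay")
puts origin
puts destination
```

The `valid_encoding?` check is a heuristic: a few Latin-1 byte sequences also happen to be valid UTF-8, which is exactly the kind of guesswork that makes fixing the client the better option.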