I am running into an OpenURI::HTTPError: 403 Forbidden error
when I try to open a URL that contains a comma (or other special characters such as a period).
I am able to open the same URL in a browser.
require 'open-uri'
url = "http://en.wikipedia.org/wiki/Thor_Industries,_Inc."
f = open(url)
# throws OpenURI::HTTPError: 403 Forbidden error
How do I escape such a URL?
I have tried escaping the URL with CGI::escape, and I get the same error.
f = open(CGI::escape(url))
Typically, one would simply require the cgi module, then use CGI::escape(str). However, this doesn’t seem to work in your particular case, and the request still returns a 403. I’ll leave this here for reference regardless.
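For reference, here is a small sketch of what CGI.escape does to this URL. The `title` and `page_url` variables are only illustrative. Note that escaping the whole URL also encodes the scheme and the slashes, so it is usually applied to a single component instead; neither form addresses the 403 here.

```ruby
require 'cgi'

url = 'http://en.wikipedia.org/wiki/Thor_Industries,_Inc.'

# Escaping the whole URL also encodes ":" and "/", so the result is no longer a usable URL.
escaped_url = CGI.escape(url)
# escaped_url == "http%3A%2F%2Fen.wikipedia.org%2Fwiki%2FThor_Industries%2C_Inc."

# Escaping only the page title (an illustrative variable) keeps the URL structure intact.
title = 'Thor_Industries,_Inc.'
escaped_title = CGI.escape(title)
# escaped_title == "Thor_Industries%2C_Inc."
page_url = "http://en.wikipedia.org/wiki/#{escaped_title}"
```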
Edit: Wikipedia is refusing your requests because it suspects that you are a bot. It seems that certain pages that are clearly content are served to you, but those that don’t match its “safe” pattern (e.g. those that contain dots or commas) are subject to its screening. If you actually output the content (I did this with Net::HTTP), you get the following:
Providing a user-agent string, however, solves the issue:
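A minimal sketch of sending a user-agent string with Net::HTTP follows. The `USER_AGENT` value and the `build_request` and `fetch` helper names are hypothetical, chosen for illustration; use a string that identifies your own client.

```ruby
require 'net/http'
require 'uri'

# Hypothetical identifier; substitute one that describes your own client.
USER_AGENT = 'MyRubyScript/1.0 (you@example.com)'

# Build a GET request that carries an explicit User-Agent header.
def build_request(url, user_agent = USER_AGENT)
  uri = URI.parse(url)
  request = Net::HTTP::Get.new(uri)
  request['User-Agent'] = user_agent
  request
end

# Send the request and return the Net::HTTPResponse (performs network I/O).
def fetch(url, user_agent = USER_AGENT)
  uri = URI.parse(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(build_request(url, user_agent))
  end
end
```

Calling `fetch("http://en.wikipedia.org/wiki/Thor_Industries,_Inc.")` returns the response object; the server may answer with a redirect that you need to follow yourself. With open-uri, the same header can be passed as an option: `open(url, "User-Agent" => USER_AGENT)`, or `URI.open(url, "User-Agent" => USER_AGENT)` on newer Ruby versions.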