I am automating a task in a Python script and have run into a problem when sending a POST to a server.
import urllib

url = 'http://www.example.com'
params = {'arg0': 'value', 'arg1': '+value'}
# Passing a data argument makes urlopen send a POST (Python 2 urllib).
f = urllib.urlopen(url, urllib.urlencode(params))
print f.read()
I captured the equivalent browser operation in Wireshark. There the second argument, arg1, is passed as +value; when I do it with Python, however, the + gets changed to %2B, i.e.
Line-based text data: application/x-www-form-urlencoded
arg0=value&arg1=%2Bvalue
when it should be:
Line-based text data: application/x-www-form-urlencoded
arg0=value&arg1=+value
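The difference can be reproduced without a server or a packet capture by printing the encoded body directly. This is only a sketch: the try/except import is an assumption added so the same lines run on Python 3, where urlencode lives in urllib.parse, and a list of pairs is used instead of a dict to keep the field order fixed.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2

# A list of pairs keeps the field order deterministic.
params = [("arg0", "value"), ("arg1", "+value")]

# urlencode escapes the literal plus sign as %2B.
body = urlencode(params)
print(body)  # arg0=value&arg1=%2Bvalue
```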
I have also used the Requests module and it seems to do the same thing.
import requests

url = 'http://www.example.com'
params = {'arg0': 'value', 'arg1': '+value'}
f = requests.post(url, data=params)
Google is not your friend when you have a problem related to ‘+’ as it seems to be a catch all for so much else.
The + character is the proper encoding for a space when quoting GET or POST data. Thus, a literal + character needs to be escaped as well, lest it be decoded to a space on the other end. See RFC 2396, sections 2.2 and 3.4, and the application/x-www-form-urlencoded section of the HTML specification.
If you are posting data to an application that does not decode a + character to a space but treats it as a literal plus sign, you need to encode your parameters yourself using the urllib.quote function, specifying that the + character is not to be encoded.
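A minimal sketch of such an encoder, assuming the form fields arrive as a dict or as a list of pairs. The name urlencode_keep_plus is hypothetical, and the try/except import is only there so the same code also runs on Python 3, where quote lives in urllib.parse.

```python
try:
    from urllib.parse import quote, parse_qsl  # Python 3
except ImportError:
    from urllib import quote  # Python 2
    from urlparse import parse_qsl


def urlencode_keep_plus(query):
    """Encode form fields, leaving literal '+' characters unescaped.

    Hypothetical helper: spaces become %20 and every other reserved
    character is percent-encoded as usual; only '+' is marked safe.
    """
    if hasattr(query, "items"):
        query = query.items()
    pairs = []
    for key, value in query:
        pairs.append(quote(str(key), safe="+") + "=" + quote(str(value), safe="+"))
    return "&".join(pairs)


params = [("arg0", "value"), ("arg1", "+value")]
body = urlencode_keep_plus(params)
print(body)  # arg0=value&arg1=+value

# A standards-following decoder reads that '+' back as a space.
print(parse_qsl(body))  # [('arg0', 'value'), ('arg1', ' value')]
```

The second print shows why this is only appropriate when the receiving application is known to treat + literally: any server that decodes form data the standard way will see a leading space instead.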
When using requests, you can simply pass the string encoded this way as the data value, but in that case you need to set the content type manually.
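A hedged sketch of that call, assuming the body has already been encoded into a string. The helper name post_preencoded and the FORM_HEADERS constant are hypothetical. Because requests sends a string body unchanged and does not add a form content type for it, the header is supplied explicitly.

```python
# Header that requests would add on its own for a dict, but not for a string body.
FORM_HEADERS = {"Content-Type": "application/x-www-form-urlencoded"}


def post_preencoded(url, body):
    """POST an already-encoded form body without further escaping."""
    # requests is third-party; importing it here lets this module load without it.
    import requests

    return requests.post(url, data=body, headers=FORM_HEADERS)


# Example call (performs a real HTTP request, so it is left commented out):
# response = post_preencoded("http://www.example.com", "arg0=value&arg1=+value")
```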