Okay, so I am using this for a Reddit bot, but I want to be able to figure out HOW to log in to any website, if that makes sense.
I realise that different websites use different login forms etc. So how do I figure out how to optimise it for each website? I’m assuming I need to look for something in the HTML source, but I have no idea what.
I do NOT want to use Mechanize or any other library (which is what all the other answers are about on here and don’t actually help me to learn what is happening), as I want to learn by myself how exactly it all works.
The urllib2 documentation really isn’t helping me.
Thanks.
I’ll preface this by saying I haven’t logged in to a site this way for a while, so I could be missing some of the more ‘accepted’ ways to do it.
I’m not sure if this is what you’re after, but without a library like `mechanize` or a more robust framework like `selenium`, in the basic case you just look at the form itself and seek out the `input`s. For instance, open `www.reddit.com`, view the source of the rendered page, and find the login form.
That form has a few `input`s: `op`, `user`, `passwd` and `rem`. Also notice the `action` attribute: that is the URL to which the form will be posted, and will therefore be our target. The last step is packing the parameters into a payload and sending it as a `POST` request to the `action` URL. It also helps to create a new `opener` that handles cookies and adds headers, which gives us a slightly more robust opener to execute the requests.
Note that this can get much more complicated. You can also do this with Gmail, for instance, but you need to pull in parameters that will change every time (such as the `GALX` parameter). Again, not sure if this is what you wanted, but hope it helps.
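
A rough sketch of those steps, standard library only. It is written for Python 3, where the `urllib2` pieces now live in `urllib.request` and `http.cookiejar`. The sample markup, the `example.com` URL, the credentials, the user-agent string and the helper names (`FormParser`, `parse_forms`, `build_login_request`, `make_opener`) are all made up for illustration; only the field names come from the description above. Treat it as a starting point rather than a working Reddit login.

```python
# Python 3 sketch; in Python 2 the same pieces are in urllib2, urllib and cookielib.
from html.parser import HTMLParser
from http.cookiejar import CookieJar
from urllib.parse import urlencode, urljoin
from urllib.request import HTTPCookieProcessor, Request, build_opener


class FormParser(HTMLParser):
    """Collect every <form> on a page together with its named <input> fields."""

    def __init__(self):
        super().__init__()
        self.forms = []       # one dict per form: action, method, inputs
        self._current = None  # the form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attrs.get("action") or "",
                "method": (attrs.get("method") or "get").lower(),
                "inputs": {},
            }
            self.forms.append(self._current)
        elif tag == "input" and self._current is not None and attrs.get("name"):
            # Keep pre-filled values: hidden inputs often carry per-request tokens.
            # Simplification: a browser would leave out unticked checkboxes.
            self._current["inputs"][attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def parse_forms(html):
    """Return the list of forms found in an HTML string."""
    parser = FormParser()
    parser.feed(html)
    return parser.forms


def build_login_request(page_url, form, credentials, user_agent="login-sketch/0.1"):
    """Merge credentials over the form defaults and target the form's action URL."""
    payload = dict(form["inputs"])
    payload.update(credentials)
    data = urlencode(payload).encode("utf-8")
    # Passing data makes urllib send the request as a POST.
    return Request(urljoin(page_url, form["action"]), data=data,
                   headers={"User-Agent": user_agent})


def make_opener():
    """Opener that stores cookies, so the session survives later requests."""
    return build_opener(HTTPCookieProcessor(CookieJar()))


# Hypothetical markup for illustration only, not copied from any real site.
SAMPLE_PAGE = """
<form method="post" action="/post/login">
  <input type="hidden" name="op" value="login-main">
  <input type="text" name="user">
  <input type="password" name="passwd">
  <input type="checkbox" name="rem">
</form>
"""

form = parse_forms(SAMPLE_PAGE)[0]
request = build_login_request("https://www.example.com/", form,
                              {"user": "alice", "passwd": "secret"})
print(sorted(form["inputs"]))  # ['op', 'passwd', 'rem', 'user']
print(request.full_url)        # https://www.example.com/post/login
print(request.get_method())    # POST
```

Against a real site you would create one opener with `make_opener()`, fetch the login page through it with `opener.open(page_url)`, parse that response instead of `SAMPLE_PAGE`, and then call `opener.open(request)`. Fetching the page and posting the form with the same opener matters for values that change on every load (such as the `GALX` parameter), since they usually arrive as hidden inputs alongside a cookie.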