I was playing with Jericho’s HTML parser, but I can’t find any information, or better yet an example, on how to set or change the user-agent. I found the class Config, but I don’t know how to use it. Can anyone give me an example, please?
I managed to parse a website the way I want, but I’m not sure whether Jericho’s parser sends a user agent at all. As you might guess, I want a proper user agent so that the site doesn’t block me from accessing its content.
Thank you.
Further to my comment above, make sure you always obey robots.txt. Aside from that, the code you want should look something like this.
I can’t run it from work because of firewall issues, but I think this should work for you. If not, something similar should do the trick.
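For reference, here is a minimal sketch of one common way to do this, assuming your Jericho version provides the `Source(URLConnection)` constructor: set the `User-Agent` header on a standard `java.net.URLConnection` yourself, then hand that connection to Jericho instead of a bare URL. The class name, method name, and user-agent string are illustrative only and are not part of Jericho.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLConnection;

public class UserAgentExample {

    // Hypothetical helper: opens a connection (without connecting yet)
    // and sets the User-Agent request header on it.
    static URLConnection openWithUserAgent(String address, String userAgent)
            throws IOException {
        URLConnection connection = URI.create(address).toURL().openConnection();
        // Request headers must be set before the connection is actually used.
        connection.setRequestProperty("User-Agent", userAgent);
        return connection;
    }

    // Assumed usage with Jericho (requires the Jericho jar on the classpath):
    //   URLConnection connection =
    //       openWithUserAgent("https://example.com/", "MyCrawler/1.0");
    //   net.htmlparser.jericho.Source source =
    //       new net.htmlparser.jericho.Source(connection);
}
```

The header is set through the plain JDK API, so this does not depend on the Config class at all. Pick a user-agent string that honestly identifies your client.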