I am looking to create a simple web service that crawls pages on specific websites and looks for a person’s name. Does anybody know of any examples of this, or can anyone help me get started?
Edit: I should mention that I want to do this in C# with Visual Studio. I will only be looking at English news sites that I specify.
Here is a simple function that returns true if a Web page contains a person’s name:
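A minimal sketch of such a function, assuming a plain case-insensitive substring match on the raw HTML is good enough. The class and method names (`NameFinder`, `ContainsName`, `PageContainsNameAsync`) are made up for illustration, and the match will also hit names that appear only in markup or scripts:

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper; names are illustrative only.
public static class NameFinder
{
    // One shared HttpClient is reused for all requests.
    private static readonly HttpClient Client = new HttpClient();

    // Pure check: case-insensitive substring search over the page source.
    public static bool ContainsName(string html, string name)
    {
        if (string.IsNullOrEmpty(html) || string.IsNullOrWhiteSpace(name))
        {
            return false;
        }
        return html.IndexOf(name, StringComparison.OrdinalIgnoreCase) >= 0;
    }

    // Downloads the page and runs the check above.
    // Throws HttpRequestException if the request fails.
    public static async Task<bool> PageContainsNameAsync(string url, string name)
    {
        string html = await Client.GetStringAsync(url);
        return ContainsName(html, name);
    }
}
```

Calling `await NameFinder.PageContainsNameAsync("http://www.somesite.com/", "John Doe")` would then return true or false for that one page. Keeping the string check separate from the download makes it easy to test without network access.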
To find the links within a page, see this question: Parse HTML links using C#
You can collect the distinct URLs throughout the site and run the name check on each URL you find.
Also, type this into Google to see what it finds:
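The crawl itself could be sketched roughly as follows. This is an illustration under several assumptions, not a robust crawler: links are pulled out with a simple regular expression rather than a real HTML parser, only links on the same host are followed, and the `SiteCrawler` class, its method names, and the `maxPages` limit are all hypothetical. The page download is passed in as a delegate so the loop can be exercised without network access.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

// Hypothetical crawler sketch; names are illustrative only.
public static class SiteCrawler
{
    // Naive href matcher; it stops at '#' so fragments are dropped.
    private static readonly Regex HrefPattern = new Regex(
        "href\\s*=\\s*[\"']([^\"'#]+)",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Returns absolute http/https links that stay on the page's host.
    public static List<Uri> ExtractSameHostLinks(string html, Uri pageUri)
    {
        var links = new List<Uri>();
        foreach (Match m in HrefPattern.Matches(html))
        {
            if (Uri.TryCreate(pageUri, m.Groups[1].Value, out Uri absolute)
                && (absolute.Scheme == Uri.UriSchemeHttp || absolute.Scheme == Uri.UriSchemeHttps)
                && absolute.Host == pageUri.Host)
            {
                links.Add(absolute);
            }
        }
        return links;
    }

    // Breadth-first crawl from 'start'; returns the pages whose source contains 'name'.
    public static async Task<List<Uri>> FindPagesWithNameAsync(
        Uri start, string name, int maxPages, Func<Uri, Task<string>> fetch)
    {
        var seen = new HashSet<Uri> { start };   // distinct URLs found so far
        var queue = new Queue<Uri>();
        queue.Enqueue(start);
        var hits = new List<Uri>();
        int fetched = 0;

        while (queue.Count > 0 && fetched < maxPages)
        {
            Uri current = queue.Dequeue();
            string html;
            try { html = await fetch(current); }
            catch (HttpRequestException) { continue; }  // skip pages that fail to load
            fetched++;

            if (html.IndexOf(name, StringComparison.OrdinalIgnoreCase) >= 0)
            {
                hits.Add(current);
            }
            foreach (Uri link in ExtractSameHostLinks(html, current))
            {
                if (seen.Add(link)) queue.Enqueue(link);  // Add returns false for duplicates
            }
        }
        return hits;
    }
}
```

With a shared `HttpClient` named `client`, the `fetch` argument can simply be `u => client.GetStringAsync(u)`. The `HashSet<Uri>` is what keeps the URLs distinct, and `maxPages` caps the crawl so a large news site does not get fetched in full.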
site:www.somesite.com "John Doe"