I want to write code that extracts the main news content from a news website. News pages contain the main story along with ads, reviews, and a copyright notice, so I want to get only the main story, as boilerpipe does, and I would like to know how to do that.
Could someone explain how this process works?
Sudhanshu
The boilerpipe website contains the source code, quickstart instructions, and links to the original scientific paper and the corresponding conference presentation video:
http://code.google.com/p/boilerpipe/
This should give you a fairly comprehensive picture of how it works and how you can apply it to your scenario.
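To give a rough feel for the general idea, here is a minimal Python sketch that splits a page into text blocks and keeps only those that are long enough and not dominated by link text. It is not boilerpipe's actual implementation: the names BlockCollector and extract_main_text, the tag lists, and the thresholds (min_words=10, max_link_density=0.33) are illustrative assumptions, and a real extractor uses more features and context than this.

```python
from html.parser import HTMLParser

# Tags assumed to start/end a text block (illustrative, not exhaustive).
BLOCK_TAGS = {"p", "div", "td", "tr", "table", "li", "ul", "ol", "br",
              "h1", "h2", "h3", "h4", "h5", "h6"}
# Tags whose content is never visible text.
SKIP_TAGS = {"script", "style"}


class BlockCollector(HTMLParser):
    """Collects (text, word_count, link_word_count) for each text block."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._words = []
        self._link_words = 0
        self._in_link = 0
        self._skip = 0

    def _flush(self):
        # Close the current block, if it holds any words.
        if self._words:
            self.blocks.append(
                (" ".join(self._words), len(self._words), self._link_words))
        self._words = []
        self._link_words = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip += 1
        elif tag == "a":
            self._in_link += 1
        elif tag in BLOCK_TAGS:
            self._flush()

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self._skip = max(0, self._skip - 1)
        elif tag == "a":
            self._in_link = max(0, self._in_link - 1)
        elif tag in BLOCK_TAGS:
            self._flush()

    def handle_data(self, data):
        if self._skip:
            return
        words = data.split()
        self._words.extend(words)
        if self._in_link:
            self._link_words += len(words)

    def close(self):
        super().close()
        self._flush()


def extract_main_text(html, min_words=10, max_link_density=0.33):
    """Keep blocks with enough words and a low share of link text.

    Both thresholds are guesses chosen for illustration only.
    """
    collector = BlockCollector()
    collector.feed(html)
    collector.close()
    kept = []
    for text, n_words, n_link_words in collector.blocks:
        link_density = n_link_words / n_words
        if n_words >= min_words and link_density <= max_link_density:
            kept.append(text)
    return "\n".join(kept)
```

Calling extract_main_text(page_html) on a page string returns the surviving blocks joined by newlines. Short blocks such as copyright lines and link-heavy blocks such as navigation bars are dropped, while long paragraphs with few links are kept. On real news pages the thresholds would need tuning.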
Best,
Christian