I want to design a semantic search engine for the final year of my Master’s degree. I have done a fair amount of reading, both casually on the web and in academic papers, so I am not a total noob in this field.
My aim is to build a semantic search engine that parses HTML content into its equivalent RDF triples and stores them in a triplestore, against which the engine answers queries using SPARQL. I wanted to do something out of the box, unlike the other students, which is why I chose this topic.
Right now, I have a running search engine built on Solr that performs keyword search; what I want to add is semantic search. I know of some open source Web 3.0 tools, but I am not sure whether they are compatible with Solr.
So, could you please give me some help with building this?
Thanks.
Regards
This may sound harsh, but you will not be able to capture everything.
You need a lot of data. Of course, a lot of data is already available in formats like OWL and RDF (e.g. WordNet, YAGO, GeoNames), but although these datasets are huge, each covers only a very small portion of a possible universe of discourse.
Developing a good semantic search engine takes a lot of resources and brain power. Projects such as KompParse at the German Research Center for Artificial Intelligence, which focus on only a small part of human conversation (gossip or buying furniture), have been running for several years with several employees and are still just “ok”.
Semantic understanding has already been implemented in various search engines, for example Google or Wolfram Alpha. So this topic might not be as “out of the box” as you think.
So I agree with user723630 and strongly advise you to focus on a smaller topic. You will still achieve a lot, and you will not get frustrated.
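
To give an idea of what a deliberately small scope could look like, here is a minimal sketch of the pipeline from the question (HTML to triples, a store, a query). It is not a real implementation: it uses only the Python standard library, the parser (`RDFaLiteParser`) and the `match` helper are hypothetical names made up for this example, the parser understands only a tiny RDFa-like subset (`about` and `property` attributes), the "triplestore" is just an in-memory set, and `match` is a simple triple-pattern lookup standing in for SPARQL. The URLs and values are made-up sample data.

```python
from html.parser import HTMLParser


class RDFaLiteParser(HTMLParser):
    """Collects (subject, predicate, object) triples from a tiny RDFa-like subset.

    Assumptions (not real RDFa): the subject is the most recent `about`
    attribute seen, the predicate is a `property` attribute, and the object
    is the plain text inside that element (no nested tags).
    """

    def __init__(self):
        super().__init__()
        self.triples = set()      # in-memory stand-in for a triplestore
        self._subject = None      # most recent `about` value
        self._predicate = None    # `property` of the element being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "about" in attrs:
            self._subject = attrs["about"]
        if "property" in attrs and self._subject is not None:
            self._predicate = attrs["property"]
            self._text = []

    def handle_data(self, data):
        # Only collect text while inside an element that carries `property`.
        if self._predicate is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._predicate is not None:
            value = "".join(self._text).strip()
            if value:
                self.triples.add((self._subject, self._predicate, value))
            self._predicate = None


def match(triples, s=None, p=None, o=None):
    """Return sorted triples matching a pattern; None is a wildcard,
    playing the role of a variable in a SPARQL basic graph pattern."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Made-up sample page for illustration only.
HTML = """
<div about="http://example.org/book/1">
  <span property="dc:title">Example Book One</span>
  <span property="dc:creator">Alice</span>
</div>
<div about="http://example.org/book/2">
  <span property="dc:title">Example Book Two</span>
</div>
"""

parser = RDFaLiteParser()
parser.feed(HTML)
store = parser.triples

# Roughly: SELECT ?s ?o WHERE { ?s dc:title ?o }
titles = match(store, p="dc:title")
for subject, _, title in titles:
    print(subject, title)
```

In a real project you would replace the set with an actual triplestore and the `match` helper with SPARQL queries, but even this toy version shows how quickly the hard part becomes getting good triples out of arbitrary HTML, which is another reason to keep the domain small.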