I am writing an “auto-wikifier” tool using HTML and JavaScript. For each word in the text to be wikified, I need to obtain a list of pages that contain that word (so that the matching phrases in the text can be automatically wikified, if they are found). Is there a way to obtain a list of all Wikipedia pages that contain a specific word, using one of Wikipedia’s APIs or web services?
function getMatchingPageTitles(theString) {
    // Get a list of all matching page titles for a specific string,
    // using one of Wikipedia's APIs or web services.
}
First, I’m not sure I understand how something like that would be useful. (Wikipedia has articles for all the common words, and I don’t think links to them would be of any use.)
But if you really wanted to do something like this, I think a much better way would be to use the API to find out which words from your input text have articles.
For example, for the string I am writing an "auto-wikifier" tool, your query could look something like:
http://en.wikipedia.org/w/api.php?format=xml&action=query&titles=I|am|writing|an|auto-wikifier|tool
And the answer is:
A few notes:
- Pages that don't exist have the missing="" attribute.
- The titles parameter has a limit of 50 titles per query.
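The approach above could be sketched in JavaScript roughly as follows. This is only an illustration under some assumptions: it requests format=json instead of format=xml, uses the https:// form of the endpoint, adds origin=* so that an anonymous request works from a browser, and the function names (chunk, buildQueryUrl, existingTitles, findExistingTitles) are made up for this example. It splits the words into batches of at most 50 titles and keeps only the pages that are not marked as missing.

```javascript
// Assumed endpoint: the https:// form of the URL used in the example query.
const API_URL = "https://en.wikipedia.org/w/api.php";
// Limit on the titles parameter mentioned in the notes.
const MAX_TITLES_PER_QUERY = 50;

// Split a list into chunks of at most `size` items.
function chunk(items, size) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Build one query URL for a batch of titles, joined with "|".
function buildQueryUrl(titles) {
  const params = new URLSearchParams({
    action: "query",
    format: "json", // assumption: JSON is easier to handle in JavaScript than XML
    origin: "*",    // assumption: anonymous cross-origin request from a browser
    titles: titles.join("|"),
  });
  return API_URL + "?" + params.toString();
}

// Return the titles of pages that exist, i.e. entries without a
// "missing" (or "invalid") marker in the response.
function existingTitles(response) {
  const pages = (response.query && response.query.pages) || {};
  return Object.values(pages)
    .filter((page) => !("missing" in page) && !("invalid" in page))
    .map((page) => page.title);
}

// Hypothetical helper: query all words in batches and collect existing titles.
async function findExistingTitles(words) {
  const unique = [...new Set(words)];
  const found = [];
  for (const batch of chunk(unique, MAX_TITLES_PER_QUERY)) {
    const res = await fetch(buildQueryUrl(batch));
    found.push(...existingTitles(await res.json()));
  }
  return found;
}
```

It could be called as findExistingTitles(["I", "am", "writing", "an", "auto-wikifier", "tool"]).then(console.log). Note that the API may normalize titles (for example, by capitalizing the first letter), so the returned titles will not always match the input words exactly.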