I am making this request:
http://en.wikipedia.org/w/api.php?format=xml&action=query&titles=self-administration&prop=revisions&rvprop=content&rvparse=&rvsection=0
My goal is to get the plain text of an article's intro.
The request returns some HTML inside an XML file. After running strip_tags and a preg_replace to remove the references, I get this:
Self-administration is, in its medical sense, the process of a subject
administering a pharmacological substance to him-, her-, or itself.
[…] Cite error: There are tags on this page, but the
references will not show without a {{Reflist}} template or a
tag; see the help page.
I want to remove this part:
Cite error: There are tags on this page, but the
references will not show without a {{Reflist}} template or a
tag; see the help page.
How can I get rid of that, either with PHP (preg_replace?) or in my initial query (by ignoring errors?)?
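For reference, here is a minimal sketch of the kind of PHP cleanup I have in mind. It assumes the notice always starts with "Cite error:" and ends with "see the help page.", which I have only checked against the output above. The function name remove_cite_error is my own hypothetical helper, not something from PHP or MediaWiki.

```php
<?php
// Hypothetical helper (not part of PHP or MediaWiki): removes the
// "Cite error: ... see the help page." notice from tag-stripped text.
function remove_cite_error(string $text): string
{
    // Assumption: the notice begins with "Cite error:" and ends with
    // "see the help page." The "s" flag lets "." match newlines, and the
    // lazy ".*?" stops at the first closing phrase.
    $pattern = '/\s*Cite error:.*?see the help page\.?/s';
    $cleaned = preg_replace($pattern, '', $text);

    // preg_replace returns null on failure; fall back to the input then.
    return trim($cleaned ?? $text);
}

$intro = "Self-administration is, in its medical sense, the process of a subject\n"
       . "administering a pharmacological substance to him-, her-, or itself.\n"
       . "Cite error: There are tags on this page, but the\n"
       . "references will not show without a {{Reflist}} template or a\n"
       . "tag; see the help page.";

echo remove_cite_error($intro), "\n";
```

This only works as long as the wording of the notice stays the same, so it feels fragile, which is why I would prefer a way to avoid the error in the query itself.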