I have the following within an XHTML document:
<script type="text/javascript" id="JSBALLOONS">
function() {
    this.init = function() {
        this.wAPI = new widgetAPI('__BALLOONS__');
        this.getRssFeed();
    };
}
</script>
I’m trying to select everything in between the two script tags. The id will always be JSBALLOONS if that helps. I know how to select that including the script tags, but I don’t know how to select the contents excluding the script tags. The result of the regular expression should be:
function() {
    this.init = function() {
        this.wAPI = new widgetAPI('__BALLOONS__');
        this.getRssFeed();
    };
}
(Updated post specifically for a JavaScript solution.)
In JavaScript, your code might look like this:
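A minimal sketch along those lines, assuming the XHTML source is already available as a string. The names xhtml, pattern, and extractScriptBody are hypothetical, and the captured text is read from the match array instead of RegExp.$1:

```javascript
// Hypothetical input: the XHTML source held in a string.
// The closing tag is written as <\/script> so the snippet also survives
// being pasted into an inline script block.
const xhtml = [
  '<script type="text/javascript" id="JSBALLOONS">',
  "function() {",
  "this.init = function() {",
  "this.wAPI = new widgetAPI('__BALLOONS__');",
  "this.getRssFeed();",
  "};",
  "}",
  "<\/script>"
].join("\n");

// Group 1 captures everything between the opening tag carrying
// id="JSBALLOONS" and the first closing script tag after it.
// [\S\s] matches any character, including newlines.
const pattern = /<script\b[^>]*\bid="JSBALLOONS"[^>]*>([\S\s]*?)<\/script>/;

// Returns the script contents without the tags, or null if nothing matched.
function extractScriptBody(source) {
  const match = pattern.exec(source);
  return match ? match[1] : null;
}

const body = extractScriptBody(xhtml);
console.log(body);
```

The captured text keeps the newlines that directly follow the opening tag and precede the closing tag, so calling trim() on the result may be useful.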
The part between parentheses, ([\S\s]*?), is saved by the regex engine and is accessible to you after a match is found. In JavaScript, you can use RegExp.$1 to refer to the matched part inside the script tags. If you have more than one such group, each surrounded by (), you can refer to them with RegExp.$2, and so on, up to RegExp.$9.
In JavaScript, the dot does not match newline characters by default, which is why we have to use ([\S\s]*?) rather than (.*?), which may look more natural. Just to be complete, in other languages this is not necessary if you use the s modifier (/.../s); JavaScript engines that implement ES2018 accept the same flag.
(I have to add that regexes are typically very fragile when scraping content from HTML pages like this. You may be better off using the jQuery framework to extract the contents.)
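The difference between (.*?) and ([\S\s]*?) can be checked with a short sketch. The sample string and variable names are made up for illustration, and the last line assumes an engine that supports the ES2018 s flag:

```javascript
// Hypothetical two-line input.
const text = "a\nb";

// Without the s flag the dot cannot cross the newline, so the anchored
// pattern fails to match and exec() returns null.
const dotOnly = /^(.*?)$/.exec(text);

// [\S\s] matches any character, newline included.
const anyChar = /^([\S\s]*?)$/.exec(text);

// With the s (dotAll) flag, the dot matches newlines as well.
const dotAll = /^(.*?)$/s.exec(text);

console.log(dotOnly === null);
console.log(JSON.stringify(anyChar[1]));
console.log(JSON.stringify(dotAll[1]));
```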