I have a paragraph of text in a JavaScript variable called ‘input_content’, and that text contains multiple anchor tags/links. I would like to match all of the anchor tags, extract the anchor text and URL from each, and put them into an array like (or similar to) this:
Array ( [0] => Array ( [0] => <a href='http://yahoo.com'>Yahoo</a> [1] => http://yahoo.com [2] => Yahoo ) [1] => Array ( [0] => <a href='http://google.com'>Google</a> [1] => http://google.com [2] => Google ) )
I’ve taken a crack at it (http://pastie.org/339755), but I am stumped beyond this point. Thanks for the help!
This assumes that your anchors will always be in the form
<a href='...'>...</a>
i.e. it won’t work if there are any other attributes (for example, target). The regular expression can be improved to accommodate this.
To break down the regular expression:
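The expression itself is not shown here, so the following is only a sketch of a pattern that fits the description above. The name anchorPattern and the sample string are hypothetical, and accepting both quote styles is an assumption. Each piece is annotated in the comments:

```javascript
// Hypothetical pattern: href must be the only attribute on the tag.
const anchorPattern = /(<a href=['"]([^'"]+)['"]>([^<]+)<\/a>)/g;
//  (               group 1: the whole anchor tag
//   <a href=['"]   literal opening, followed by either quote style
//   ([^'"]+)       group 2: the URL, everything up to the closing quote
//   ['"]>          closing quote and the end of the opening tag
//   ([^<]+)        group 3: the anchor text, everything up to the next "<"
//   <\/a>          literal closing tag
//  )

// Illustrative input only.
const sample = "Try <a href='http://yahoo.com'>Yahoo</a> first.";
const firstMatch = anchorPattern.exec(sample);

console.log(firstMatch[1]); // <a href='http://yahoo.com'>Yahoo</a>
console.log(firstMatch[2]); // http://yahoo.com
console.log(firstMatch[3]); // Yahoo
```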
Each call to our anonymous function will receive three tokens as the second, third and fourth arguments, namely arguments[1], arguments[2], arguments[3]:
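To see where those tokens land, here is a small sketch. It assumes the anonymous function is passed as the callback to String.prototype.replace; the pattern, the sample text and the name seen are made up for illustration. arguments[0] is the full match, and the capture groups follow it:

```javascript
// Illustrative input and pattern (single-quoted href, no other attributes).
const text = "See <a href='http://google.com'>Google</a>.";
const seen = [];

text.replace(/(<a href='([^']+)'>([^<]+)<\/a>)/g, function () {
  // arguments[0] is the full match; the three capture groups come next.
  seen.push(arguments[1]); // the whole anchor tag
  seen.push(arguments[2]); // the URL
  seen.push(arguments[3]); // the anchor text
  return arguments[0];     // leave the text unchanged; the result is discarded anyway
});

console.log(seen);
// [ "<a href='http://google.com'>Google</a>", "http://google.com", "Google" ]
```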
We’ll use a hack to push these three arguments as a new array into our main matches array. The arguments built-in variable is not a true JavaScript Array, so we’ll have to apply the slice Array method on it to extract the items we want:
This will extract items from arguments starting at index 1 and ending (not inclusive) at index 4.
Gives:
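Putting the pieces together, a minimal end-to-end sketch could look like this. The sample text is built from the two links in the question, and the pattern is an assumed one (single-quoted href as the only attribute), not necessarily the expression originally posted:

```javascript
// Sample input based on the links in the question.
const input_content =
  "Visit <a href='http://yahoo.com'>Yahoo</a> or " +
  "<a href='http://google.com'>Google</a> today.";

const matches = [];

// replace() is used only to visit every match; its return value is ignored.
input_content.replace(/(<a href='([^']+)'>([^<]+)<\/a>)/g, function () {
  // arguments is array-like, not an Array, so borrow Array.prototype.slice.
  // Indices 1, 2 and 3 are copied; index 4 is excluded.
  matches.push(Array.prototype.slice.call(arguments, 1, 4));
});

console.log(matches);
// [ [ "<a href='http://yahoo.com'>Yahoo</a>", "http://yahoo.com", "Yahoo" ],
//   [ "<a href='http://google.com'>Google</a>", "http://google.com", "Google" ] ]
```

Each inner array has the whole tag at index 0, the URL at index 1 and the anchor text at index 2, which matches the layout asked for in the question.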