console.log( html.match( /<a href=(.*?)>[^<]+<\/a>/g )); Instead of returning just the urls like: http://google, http://yahoo.com

Question

Asked: May 23, 2026

console.log( html.match( /<a href="(.*?)">[^<]+<\/a>/g ));

Instead of returning just the urls like:

http://google.com, http://yahoo.com

It’s returning the entire tag:

<a href="http://google.com">Google.com</a>, <a href="http://yahoo.com">Yahoo.com</a>

Why is that the case?

1 Answer

Editorial Team · Answer 1 · 2026-05-23T00:03:35+00:00

You want RegExp#exec in a loop, reading index 1 of each match result, rather than String.match. When the regular expression has the g flag, String.match doesn't return the capture groups; it returns an array holding only the element at index 0 of each match, which is the whole matching string. (See Section 15.5.4.10 of the ECMAScript spec.)
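To make the difference concrete, here is a minimal sketch comparing String.match with and without the g flag. The sample string is built from the tags shown in the question, and the variable names (sampleHtml, withGlobal, withoutGlobal) are only illustrative.

```javascript
// Sample input assembled from the tags in the question (illustrative only).
const sampleHtml =
  '<a href="http://google.com">Google.com</a> <a href="http://yahoo.com">Yahoo.com</a>';

// With the g flag: an array of whole matches, no capture groups.
const withGlobal = sampleHtml.match(/<a href="(.*?)">[^<]+<\/a>/g);
console.log(withGlobal);

// Without the g flag: only the first match, but capture groups are included.
const withoutGlobal = sampleHtml.match(/<a href="(.*?)">[^<]+<\/a>/);
console.log(withoutGlobal[0]); // the whole first tag
console.log(withoutGlobal[1]); // the first capture group: http://google.com
```

Dropping the g flag exposes the capture group but stops after the first link, which is why a loop is needed to collect every URL.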

So in essence:

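var re, match, html;

re = /<a href="(.*?)">[^<]+<\/a>/g;
html = 'Testing <a href="http://yahoo.com">one two three</a> <a href="http://google.com">one two three</a> foo';

re.lastIndex = 0; // Work around literal bug in some implementations (a stale lastIndex on a reused regex literal)
// Each exec call returns the next match, including capture groups, or null when none remain
for (match = re.exec(html); match; match = re.exec(html)) {
  console.log(match[1]); // index 1 is the first capture group: the href value
}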
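
In newer JavaScript engines (ES2020 and later), String.prototype.matchAll offers a shorter way to get the same result. This is an alternative to the exec loop, not what the answer above uses, and the names linkPattern, pageHtml and urls are assumptions made for the sketch.

```javascript
// Same pattern and sample input as the exec loop above.
const linkPattern = /<a href="(.*?)">[^<]+<\/a>/g; // matchAll requires the g flag
const pageHtml =
  'Testing <a href="http://yahoo.com">one two three</a> <a href="http://google.com">one two three</a> foo';

// matchAll yields one match object per hit, each with its capture groups;
// the mapping function keeps only group 1 (the href value).
const urls = Array.from(pageHtml.matchAll(linkPattern), (m) => m[1]);

console.log(urls); // [ 'http://yahoo.com', 'http://google.com' ]
```

matchAll works on a copy of the regular expression, so it does not depend on or modify the lastIndex of linkPattern.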

But this is parsing HTML with regular expressions. Here There Be Dragons.

Update: regarding the dragons, here’s a quick list of things that will defeat this regexp, off the top of my head:

- Anything other than exactly one space between the a and href, such as two spaces rather than one, a line break, another attribute like class='foo', etc., etc.
- Using single quotes rather than double quotes around the href attribute.
- Not using quotes around the href attribute at all.
- Anything after the href attribute that also uses double quotes, e.g.:
```
<a href="http://google.com" class="foo">
```

This is not to be down on your regexp, it’s just to highlight that regular expressions can’t be reliably used on their own to parse HTML. They can form part of the solution, helping you scan for tokens, but they can’t reliably do the whole job.
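
The failure cases listed above can be checked directly. The following is a small sketch that runs the same pattern (without the g flag, so the capture group is visible) against one hypothetical input per case; the case names and sample tags are made up for illustration.

```javascript
const fragilePattern = /<a href="(.*?)">[^<]+<\/a>/;

// One hypothetical input per failure case from the list above.
const cases = {
  baseline: '<a href="http://google.com">Google</a>',
  twoSpaces: '<a  href="http://google.com">Google</a>',
  attrBeforeHref: '<a class=\'foo\' href="http://google.com">Google</a>',
  singleQuotes: "<a href='http://google.com'>Google</a>",
  noQuotes: "<a href=http://google.com>Google</a>",
  attrAfterHref: '<a href="http://google.com" class="foo">Google</a>',
};

// Record the captured href for each case, or null when the pattern does not match.
const results = {};
for (const [name, tag] of Object.entries(cases)) {
  const m = tag.match(fragilePattern);
  results[name] = m ? m[1] : null;
  console.log(name, "->", results[name]);
}
```

Only the baseline yields the expected URL. The first four variations fail to match at all, while the last one matches but captures the wrong text (the URL plus the start of the class attribute), which is arguably worse because it fails silently.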
