Question

Asked: May 10, 2026 (2026-05-10T20:13:38+00:00)

Given an HTML document, what is the most correct and concise regular expression pattern to remove the query strings from each URL in the document?

1 Answer

score 0 · Answer 1 · 2026-05-10T20:13:38+00:00

You can’t usefully parse HTML with a regular expression. If you know the format of the page in advance, e.g.

links are always in the form <a href='url with no unnecessary character escapes'>, or
all links are absolute, and no other non-link strings beginning with http: exist

then you can just about get away with it, but for general [X]HTML a regexp-based parser is unsuitable.
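For the restricted case only, here is a rough sketch of what "getting away with it" might look like. It assumes every URL is absolute (http or https) and that nothing else in the page starts with http:; the names URL_WITH_QUERY and strip_queries_naive are made up for illustration, and this will misfire on pages that break those assumptions.

```python
import re

# Assumption: every URL is absolute and no non-link text begins with
# http:// or https://. Group 1 is the URL up to (not including) the '?';
# the rest of the match is the query string, which stops at whitespace,
# a quote, an angle bracket, or a '#' fragment.
URL_WITH_QUERY = re.compile(r"""(https?://[^\s"'<>?#]+)\?[^\s"'<>#]*""")


def strip_queries_naive(text):
    """Drop the query string from every absolute URL found in text."""
    return URL_WITH_QUERY.sub(r"\1", text)


print(strip_queries_naive('<a href="http://example.com/a?b=1&amp;c=2">x</a>'))
```

Relative links, URLs inside comments or scripts, and unusual escaping are all handled wrongly or not at all, which is why a real parser is the safer route for general pages.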

Depending on what language you’re using, you’d need to find either an HTML parser library (e.g. Python’s BeautifulSoup), or an HTML tidier combined with a standard XML parser, then scan the document for <a> elements (and maybe others, e.g. <img>, if you’re interested in those), then split the attribute value on ‘?’ and keep the part before it.
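As a minimal sketch of the parser-based approach, here is one way it could look in Python. To keep it dependency-free it uses the standard library's html.parser rather than BeautifulSoup, and urllib.parse instead of a bare split on '?' so that any #fragment survives. The names URL_ATTRS, strip_query, QueryStripper and remove_query_strings are hypothetical, and the choice of which tag/attribute pairs count as URLs is an assumption you would adjust.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urlsplit, urlunsplit

# Assumption: only these tag/attribute pairs hold URLs; extend as needed.
URL_ATTRS = {"a": "href", "img": "src"}


def strip_query(url):
    """Return url without its query string, keeping any #fragment."""
    return urlunsplit(urlsplit(url)._replace(query=""))


class QueryStripper(HTMLParser):
    """Re-emit the document, rewriting URL attributes along the way."""

    def __init__(self):
        # Keep character references in text as written instead of decoding.
        super().__init__(convert_charrefs=False)
        self.out = []

    def _emit_tag(self, tag, attrs, closing):
        pieces = [tag]
        for name, value in attrs:
            if value is None:  # valueless attribute such as "disabled"
                pieces.append(name)
                continue
            if URL_ATTRS.get(tag) == name:
                value = strip_query(value)
            # The parser hands us unescaped values, so escape them again.
            pieces.append('%s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<" + " ".join(pieces) + closing)

    def handle_starttag(self, tag, attrs):
        self._emit_tag(tag, attrs, ">")

    def handle_startendtag(self, tag, attrs):
        self._emit_tag(tag, attrs, " />")

    def handle_endtag(self, tag):
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)


def remove_query_strings(html_text):
    parser = QueryStripper()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


print(remove_query_strings('<a href="http://example.com/page?id=1&amp;x=2">this</a>'))
```

Note that the output is re-serialized, so tag-name case, attribute quoting and whitespace inside tags may differ from the input, and processing instructions are dropped in this sketch. A library such as BeautifulSoup would handle badly broken markup more gracefully.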
