The Archive Base


Editorial Team
Asked: June 11, 2026

I am trying to convert all URLs in a textarea input ($_POST['content']) into links.

$content = preg_replace('!(\s|^)((https?://)+[a-z0-9_./?=&-]+)!i', ' <a href="$2" target="_blank">$2</a> ', nl2br($_POST['content'])." ");
$content = preg_replace('!(\s|^)((www\.)+[a-z0-9_./?=&-]+)!i', '<a target="_blank" href="http://$2"  target="_blank">$2</a> ', $content." ");

Target link formats: www.hello.com or http(s)://(www).hello.com

But this seems to break any iframe, image, or similar tag.

What is the right regex (or regexes) to ignore URLs inside HTML tags?

Note: I know I need two expressions: one to detect links without a protocol (like www.hello.com, so I need to prepend it) and another to detect URLs with a protocol (so there is no need to prepend).

1 Answer

  1. Editorial Team
    Added an answer on June 11, 2026 at 9:43 pm

    Your code as it stands should rarely cause problems inside iframes and similar tags, because there the URL is usually preceded by a " rather than by the whitespace your pattern requires.
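
    To make that concrete, here is a small sketch that runs the first pattern from the question against two made-up inputs (the example tags and variable names are assumptions for illustration). A URL directly after a quote is left alone, while a URL that follows a space inside an attribute value is still rewritten, which is the kind of case that breaks a tag.

    ```php
    <?php
    // The first pattern from the question, unchanged: a URL only matches
    // at the start of the string or directly after whitespace.
    $pattern = '!(\s|^)((https?://)+[a-z0-9_./?=&-]+)!i';
    $replacement = ' <a href="$2" target="_blank">$2</a> ';

    // URL directly after a double quote: no whitespace in front, so no match.
    $quoted = '<iframe src="http://example.com/embed"></iframe>';
    $quotedResult = preg_replace($pattern, $replacement, $quoted);

    // URL after a space inside an attribute value: this one does match.
    $broken = '<img src="pic.png" alt="See http://example.com/info">';
    $brokenResult = preg_replace($pattern, $replacement, $broken);

    var_dump($quotedResult === $quoted); // unchanged input
    var_dump($brokenResult === $broken); // input was modified
    ```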

    However, here is a different solution. It might not work 100% if you have a lone < or > inside HTML comments or something similar (I do not know whether that is a problem for you), but in any other case it should serve you well. It uses a negative lookahead to make sure that there is no closing > before the next opening < (because that would mean you are inside a tag).

    $content = preg_replace('$(\s|^)(https?://[a-z0-9_./?=&-]+)(?![^<>]*>)$i', ' <a href="$2" target="_blank">$2</a> ', $content." ");
    $content = preg_replace('$(\s|^)(www\.[a-z0-9_./?=&-]+)(?![^<>]*>)$i', ' <a href="http://$2" target="_blank">$2</a> ', $content." ");
    

    In case you are not familiar with this technique, here is a bit more elaboration.

    (?!        # starts the lookahead assertion; now your pattern will only match, if this subpattern does not match
    [^<>]      # any character that is neither < nor >; the > is not strictly necessary but might help for optimization
    *          # arbitrarily many of those characters (but in a row; so not a single < or > in between)
    >          # the closing >
    )          # ends the lookahead subpattern
    

    Note that I changed the regex delimiters, because I am now using ! within the regex.
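
    As a quick check of the lookahead, here is a sketch that applies the first pattern above to a made-up input containing one URL inside an attribute value and one in plain text (the input string and variable names are assumptions). Only the plain-text URL should end up wrapped in a link.

    ```php
    <?php
    // Pattern from the answer: a URL that is NOT followed by
    // "any run of non-bracket characters, then >" (i.e. not inside a tag).
    $pattern = '$(\s|^)(https?://[a-z0-9_./?=&-]+)(?![^<>]*>)$i';
    $replacement = ' <a href="$2" target="_blank">$2</a> ';

    // Hypothetical input: one URL inside a tag, one in the surrounding text.
    $input = '<img src="pic.png" alt="See http://example.com/info"> Visit http://example.com/home today';

    $output = preg_replace($pattern, $replacement, $input);
    echo $output, PHP_EOL;
    ```

    The URL in the alt attribute is followed by "> with no < in between, so the lookahead rejects it at every backtracking position; the URL in the text is followed by plain text up to the end of the string, so it is linked.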

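    Unless you still need the first subpattern (\s|^) for URLs outside of tags, you can now remove it, too (and decrement the capture references in the replacement). Be aware, though, that without it the second call also matches the www. part of link text that the first call has just produced (for example with http://www.hello.com), so keep the subpattern if such URLs can occur.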

    $content = preg_replace('$(https?://[a-z0-9_./?=&-]+)(?![^<>]*>)$i', ' <a href="$1" target="_blank">$1</a> ', $content." ");
    $content = preg_replace('$(www\.[a-z0-9_./?=&-]+)(?![^<>]*>)$i', ' <a href="http://$1" target="_blank">$1</a> ', $content." ");
    

    And lastly: did you intend to skip URLs that end in an anchor, e.g. www.hello.com/index.html#section1? If you left this out by accident, add the # to your allowed URL characters:

    $content = preg_replace('$(https?://[a-z0-9_./?=&#-]+)(?![^<>]*>)$i', ' <a href="$1" target="_blank">$1</a> ', $content." ");
    $content = preg_replace('$(www\.[a-z0-9_./?=&#-]+)(?![^<>]*>)$i', ' <a href="http://$1" target="_blank">$1</a> ', $content." ");
    

    EDIT: Also, what about + and %? There are also a few other characters that are allowed to appear in a URL without being encoded. See this. END OF EDIT
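
    If you extend the character class anyway, one option is to fold both calls into a single pass with preg_replace_callback, so the www. pattern never sees links produced for URLs with a protocol. This is only a sketch: the function name linkify is made up, and the character class simply adds #, + and % to the set used above rather than covering everything a URL may contain.

    ```php
    <?php
    // Hypothetical helper (not part of the original snippets): links both
    // URL forms in one pass over the input.
    function linkify(string $html): string
    {
        // Group 1: URL with protocol. Group 2: bare www. URL.
        // The trailing lookahead skips anything that sits inside a tag.
        $pattern = '~(?:(https?://[a-z0-9_./?=&#+%-]+)|(www\.[a-z0-9_./?=&#+%-]+))(?![^<>]*>)~i';

        $result = preg_replace_callback($pattern, function (array $m): string {
            $text = $m[0];
            // Prepend a protocol only when the bare www. alternative matched.
            $href = !empty($m[2]) ? 'http://' . $text : $text;
            return '<a href="' . $href . '" target="_blank">' . $text . '</a>';
        }, $html);

        // preg_replace_callback returns null on a PCRE error; fall back to the input.
        return $result ?? $html;
    }

    echo linkify('See www.hello.com/a%20b#top and https://www.hello.com/?q=1+2'), PHP_EOL;
    ```

    Because each URL is consumed once, https://www.hello.com/... is linked as a whole and its www. part is not linked a second time.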

    I think this should do the trick for you. However, if you could provide an example that shows working and broken URLs (with the code you have), we could actually provide solutions that are tested to work for all of your cases.

    One final thought. The proper solution would be to use a DOM parser. Then you could simply apply the regex you already have only to text nodes. However, your concern for the HTML structure is very restricted, and that makes your problem regular again (as long as you do not have unmatched '<' or '>' in HTML comments or JavaScript or CSS on the page). If you do have those special cases, you should really look into a DOM parser. None of the solutions presented here (so far) will be safe in that case.
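
    For reference, a minimal sketch of the DOM approach, assuming PHP's bundled DOM extension is available. The function name linkifyTextNodes, the wrapper div with id "wrap" and the character class are my own choices for illustration, not something taken from your code.

    ```php
    <?php
    // Hypothetical helper: linkify URLs in text nodes only, so tags,
    // attributes, scripts and styles are never touched.
    function linkifyTextNodes(string $html): string
    {
        $doc = new DOMDocument();
        libxml_use_internal_errors(true); // tolerate sloppy user HTML
        // Wrap the fragment so it can be read back; the prolog hints UTF-8 to libxml.
        $doc->loadHTML('<?xml encoding="UTF-8"><div id="wrap">' . $html . '</div>');
        libxml_clear_errors();

        $xpath = new DOMXPath($doc);
        $wrap = $xpath->query('//div[@id="wrap"]')->item(0);
        // Text nodes that are not already inside a link, script or style element.
        $nodes = $xpath->query('.//text()[not(ancestor::a or ancestor::script or ancestor::style)]', $wrap);

        foreach (iterator_to_array($nodes) as $node) {
            // Split into plain text and URLs; captured URLs land on odd indexes.
            $parts = preg_split('~((?:https?://|www\.)[a-z0-9_./?=&#+%-]+)~i', $node->nodeValue, -1, PREG_SPLIT_DELIM_CAPTURE);
            if (count($parts) < 2) {
                continue; // no URL in this text node
            }
            $frag = $doc->createDocumentFragment();
            foreach ($parts as $i => $part) {
                if ($i % 2 === 1) {
                    $a = $doc->createElement('a');
                    $a->setAttribute('href', stripos($part, 'www.') === 0 ? 'http://' . $part : $part);
                    $a->setAttribute('target', '_blank');
                    $a->appendChild($doc->createTextNode($part));
                    $frag->appendChild($a);
                } elseif ($part !== '') {
                    $frag->appendChild($doc->createTextNode($part));
                }
            }
            $node->parentNode->replaceChild($frag, $node);
        }

        // Serialize only the children of the wrapper div.
        $out = '';
        foreach ($wrap->childNodes as $child) {
            $out .= $doc->saveHTML($child);
        }
        return $out;
    }

    echo linkifyTextNodes('<img src="http://example.com/pic.png" alt="x"> see www.hello.com now'), PHP_EOL;
    ```

    Since the parser decides what is text and what is markup, stray < or > characters in comments, scripts or styles no longer matter; the price is that the HTML is re-serialized, so the output may be normalized compared to the input.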
