The Archive Base

Editorial Team
Asked: June 9, 2026

I’m using phpBB3 to make a message board. There is a built in feature

I’m using phpBB3 to make a message board. There is a built-in feature that takes all URLs in posts and renders them as links. I want to make it so that ONLY local links are made clickable.

phpBB3 runs regexes over the text of a post and changes each match into a link:

if ($somestuff){
// matches a xxxx://aaaaa.bbb.cccc. ...
$magic_url_match[] = '#(^|[\n\t (>.])(' . "[a-z]$scheme*:/{2}(?:(?:[a-z0-9\-._~!$&'($inline*+,;=:@|]+|%[\dA-F]{2})+|[0-9.]+|\[[a-z0-9.]+:[a-z0-9.]+:[a-z0-9.:]+\])(?::\d*)?(?:/(?:[a-z0-9\-._~!$&'($inline*+,;=:@|]+|%[\dA-F]{2})*)*(?:\?(?:[a-z0-9\-._~!$&'($inline*+,;=:@/?|]+|%[\dA-F]{2})*)?(?:\#(?:[a-z0-9\-._~!$&'($inline*+,;=:@/?|]+|%[\dA-F]{2})*)?" . ')#ie';
$magic_url_replace[] = "make_clickable_callback(MAGIC_URL_FULL, '\$1', '\$2', '', '$class')";

// matches a "www.xxxx.yyyy[/zzzz]" kinda lazy URL thing
$magic_url_match[] = '#(^|[\n\t (>])(' . "www\.(?:[a-z0-9\-._~!$&'($inline*+,;=:@|]+|%[\dA-F]{2})+(?::\d*)?(?:/(?:[a-z0-9\-._~!$&'($inline*+,;=:@|]+|%[\dA-F]{2})*)*(?:\?(?:[a-z0-9\-._~!$&'($inline*+,;=:@/?|]+|%[\dA-F]{2})*)?(?:\#(?:[a-z0-9\-._~!$&'($inline*+,;=:@/?|]+|%[\dA-F]{2})*)?" . ')#ie';
$magic_url_replace[] = "make_clickable_callback(MAGIC_URL_WWW, '\$1', '\$2', '', '$class')";
}
return preg_replace($magic_url_match, $magic_url_replace, $text);

How can I rewrite these regex so that they only match links on my domain? Additionally, what is the best way to teach myself regex?

1 Answer

  1. Editorial Team
    Added an answer on June 9, 2026 at 11:09 pm

    This is the first regex, broken up section by section. Even doing this was non-trivial…

    (
        ^
    |
        [\n\t (>.]
    )
    

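    OK, here we simply have "beginning of the text, or after a newline, tab, space, opening parenthesis, greater-than sign, or period". This just anchors the regex, so a URL is only picked up at the very start or right after one of those characters. (The pattern has no m flag, so ^ means the start of the whole text, not of each line.)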
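To see what that leading group does on its own, here is a minimal sketch. Only the leading group is copied from the phpBB3 pattern; the `https?://\S+` part is a hypothetical stand-in for the much longer URL expression, and the sample strings are made up.

```php
<?php
// Leading group copied from the original; the URL part is a simplified placeholder.
$pattern = '#(^|[\n\t (>.])(https?://\S+)#i';

$samples = [
    'http://example.com/a',      // URL at the very start of the text
    'see http://example.com/a',  // URL after a space
    '(http://example.com/a',     // URL after an opening parenthesis
    'xhttp://example.com/a',     // URL glued to a preceding letter
];

foreach ($samples as $sample) {
    $result = preg_match($pattern, $sample) ? 'match' : 'no match';
    printf("%-26s => %s\n", $sample, $result);
}
```

The first three samples match; the last one does not, because a letter is not one of the allowed preceding characters.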

    (
        [a-z]$scheme*:/{2}
    

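    This part is harder to read than it needs to be. $scheme is a PHP variable that gets interpolated into the pattern. Because the * applies to whatever it expands to, it presumably holds a character class of legal scheme characters rather than the literal text http (with a literal http the pattern would be [a-z]http*, which matches things like xhttpp). So [a-z]$scheme* matches a scheme such as http or ftp, and :/{2} matches the :// after it. Why someone would use /{2} instead of //, I cannot begin to guess.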

        (?:
            (?:
                [a-z0-9\-._~!$&'($inline*+,;=:@|]+
            |
                %[\dA-F]{2}
            )+
        |
    

    This matches a series of characters, presumably those that are legal in a URL. Of note is the $inline PHP variable, which sits inside the character class right after the (. I can’t tell from this snippet what it holds; a ) would be a plausible guess. The second alternative, %[\dA-F]{2}, matches percent-encoded characters such as %20 for a space. The % sign is not otherwise legal in the match (or in a URL).
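As a small illustration of the percent-encoding alternative, the sketch below encodes a made-up string with PHP's rawurlencode() and then picks out the escapes with the same %[\dA-F]{2} sub-pattern. The input string is an assumption chosen for the example.

```php
<?php
// The sub-pattern from the regex above, on its own.
$pattern = '#%[\dA-F]{2}#';

// rawurlencode() turns the space into %20 and the slash into %2F.
$encoded = rawurlencode('a b/c');

preg_match_all($pattern, $encoded, $matches);

echo $encoded, "\n";      // a%20b%2Fc
print_r($matches[0]);     // the two escapes: %20 and %2F
```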

    Also important here is that / is not legal. This, therefore, cannot refer to directories, only to the domain. This is most likely the part you want to change, to simply match the appropriate domain of your website.

    For completeness’s sake, though, here’s the rest.

            [0-9.]+
        |
    

    Alternatively, we could have a series of digits and periods – an IP address. Considering how complicated this regex is, I’m surprised he didn’t go for (?:\d{1,3}\.){3}\d{1,3}…
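The difference between the two IP patterns is easy to check. This sketch anchors both with ^ and $ so they can be run against standalone strings (the candidates are made up); neither pattern range-checks the numbers, so filter_var() is shown as the stricter built-in option.

```php
<?php
$loose  = '#^[0-9.]+$#';                  // what the phpBB3 pattern uses
$strict = '#^(?:\d{1,3}\.){3}\d{1,3}$#';  // the dotted-quad form mentioned above

$candidates = ['192.168.0.1', '1.2.3', '....', '999.999.999.999'];

foreach ($candidates as $candidate) {
    printf(
        "%-16s loose=%d strict=%d valid=%s\n",
        $candidate,
        preg_match($loose, $candidate),
        preg_match($strict, $candidate),
        filter_var($candidate, FILTER_VALIDATE_IP) !== false ? 'yes' : 'no'
    );
}
```

The loose pattern accepts all four strings, the dotted-quad pattern accepts the first and the last, and only 192.168.0.1 passes filter_var().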

            \[
            [a-z0-9.]+
            :
            [a-z0-9.]+
            :
            [a-z0-9.:]+
            \]
        )
    

    Here’s our last alternative; I think this is for IPv6. It’s a series of letters, digits, and periods separated by colons, anyway. It requires that these be within square brackets, which is how IPv6 literals are written in URLs (RFC 3986), though it looks odd in forum software that uses brackets so heavily for tags…

        (?:
            :
            \d*
        )?
    

    Here, we get the option of some digits following a colon. That is, this is for URLs that have a port in them.

        (?:
            /
            (?:
                [a-z0-9\-._~!$&'($inline*+,;=:@|]+
            |
                %[\dA-F]{2}
            )*
        )*
    

    OK, here we’ve gotten to the subdirectories, as shown by the / at the beginning. Otherwise, this is the same "legal URL characters" match.

        (?:
            \?
            (?:
                [a-z0-9\-._~!$&'($inline*+,;=:@/?|]+
            |
                %[\dA-F]{2}
            )*
        )?
        (?:
            \#
            (?:
                [a-z0-9\-._~!$&'($inline*+,;=:@/?|]+
            |
                %[\dA-F]{2}
            )*
        )?
    )
    

    Finally, the query string (the parameters passed by GET), introduced by the \?, and the fragment that links to a mid-page anchor, introduced by the \#.
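The pieces walked through above (scheme, host, port, path, query, fragment) are the same components that PHP's built-in parse_url() reports, which can help when checking what a given section of the regex is supposed to cover. The URL below is a made-up example.

```php
<?php
// Hypothetical URL that exercises every section of the regex.
$url = 'http://www.example.com:8080/forum/viewtopic.php?t=42#p7';

$parts = parse_url($url);

foreach ($parts as $name => $value) {
    printf("%-9s %s\n", $name, $value);
}
// scheme    http
// host      www.example.com
// port      8080
// path      /forum/viewtopic.php
// query     t=42
// fragment  p7
```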

    Bottom line:

    This section:

        [a-z]$scheme*:/{2}
        (?:
            (?:
                [a-z0-9\-._~!$&'($inline*+,;=:@|]+
            |
                %[\dA-F]{2}
            )+
        |
            [0-9.]+
        |
            \[
            [a-z0-9.]+
            :
            [a-z0-9.]+
            :
            [a-z0-9.:]+
            \]
        )
    

    Should be replaced with something like this:

        [a-z]$scheme*://
        www\.example\.com
    

    Or maybe

        [a-z]$scheme*://
        (?:
            www\.example\.com
        |
            192\.168\.0\.1
        |
            \[::ffff:192\.168\.0\.1\]
        )
    

    Here the domain and the IP addresses should be replaced with those of your website. You’ll have to remove the line breaks and indentation I added. I’d do it for you, but I think it’s almost not worth it, because you’d have a hard time finding the spot where your domain goes in the middle of all that.

    You’ll probably want to include some regex for subdomains or people leaving out the www. or what have you.
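One possible host expression for that, as a sketch: example.com is a placeholder, and the ^ and $ anchors are only there so the expression can be tested against standalone host names.

```php
<?php
// Zero or more "label." prefixes, then the placeholder domain itself.
$host    = '(?:[a-z0-9\-]+\.)*example\.com';
$pattern = '#^' . $host . '$#i';

$hosts = [
    'example.com',           // bare domain, no www.
    'www.example.com',
    'forum.example.com',     // any subdomain
    'notexample.com',        // different domain with the same ending
    'example.com.evil.org',  // local name used as a prefix of another domain
];

foreach ($hosts as $candidate) {
    printf("%-22s %s\n", $candidate, preg_match($pattern, $candidate) ? 'local' : 'foreign');
}
```

The first three come out as local and the last two as foreign. When the expression is dropped into the full URL pattern, the anchors go away, so something else (for example a look-ahead) has to stop a host like example.com.evil.org from being accepted on the strength of its first part.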

    You may also want to remove this:

        (?:
            :
            \d*
        )?
    

    As you probably don’t want people linking to other ports on your domain.
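Put together on one line, the first pattern could look roughly like the sketch below. Several things here are assumptions rather than part of the original code: www.example.com is a placeholder, $scheme is replaced by the class [a-z\d+], $inline is taken to be empty, the port section is dropped, and a negative look-ahead is added after the host so that http://www.example.com.evil.org is not accepted. The /e modifier used in the phpBB3 snippet was removed in PHP 7, so the sketch uses preg_replace_callback(), with a hypothetical linkify_local() standing in for make_clickable_callback().

```php
<?php
// Placeholder host; preg_quote() escapes the dots.
$host = preg_quote('www.example.com', '#');

// Character classes copied from the original pattern ($inline assumed empty).
$path_chars  = '(?:[a-z0-9\-._~!$&\'(*+,;=:@|]+|%[\dA-F]{2})';
$query_chars = '(?:[a-z0-9\-._~!$&\'(*+,;=:@/?|]+|%[\dA-F]{2})';

$local_url = '[a-z][a-z\d+]*://' . $host
    . '(?![a-z0-9\-@:]|\.[a-z0-9])'       // added guard: host must end here, no port
    . '(?:/' . $path_chars . '*)*'        // path
    . '(?:\?' . $query_chars . '*)?'      // query string
    . '(?:\#' . $query_chars . '*)?';     // fragment

$pattern = '#(^|[\n\t (>.])(' . $local_url . ')#i';

// Hypothetical replacement for make_clickable_callback(); assumes $text has
// not been HTML-escaped yet and escapes only the URL it inserts.
function linkify_local(string $text, string $pattern): string
{
    $result = preg_replace_callback($pattern, function (array $m): string {
        $url = htmlspecialchars($m[2], ENT_QUOTES);
        return $m[1] . '<a href="' . $url . '">' . $url . '</a>';
    }, $text);

    return $result ?? $text; // fall back to the input if PCRE reports an error
}

echo linkify_local('Read http://www.example.com/viewtopic.php?t=42 first', $pattern), "\n";
echo linkify_local('Spam: http://evil.example.org/ here', $pattern), "\n";
```

The first line comes back with the URL wrapped in an anchor tag; the second is returned unchanged.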

    The second one looks to have roughly the same structure; as the comment says, it’s just getting URLs that lack the protocol designator.
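The same restriction applied to the second pattern might look like this. As before, this is a sketch under assumptions: www.example.com is a placeholder, $inline is taken to be empty, the port section is dropped, and the look-ahead after the host is an addition that is not in the original.

```php
<?php
// Character classes copied from the original pattern ($inline assumed empty).
$path_chars  = '(?:[a-z0-9\-._~!$&\'(*+,;=:@|]+|%[\dA-F]{2})';
$query_chars = '(?:[a-z0-9\-._~!$&\'(*+,;=:@/?|]+|%[\dA-F]{2})';

$tail = '(?![a-z0-9\-@:]|\.[a-z0-9])'     // added guard: host must end here
    . '(?:/' . $path_chars . '*)*'        // path
    . '(?:\?' . $query_chars . '*)?'      // query string
    . '(?:\#' . $query_chars . '*)?';     // fragment

// Leading group as in the second phpBB3 pattern (no period in the class).
$pattern = '#(^|[\n\t (>])(www\.example\.com' . $tail . ')#i';

foreach (['go to www.example.com/faq.php now', 'go to www.evil.org/ now'] as $text) {
    if (preg_match($pattern, $text, $m)) {
        echo 'local link: ', $m[2], "\n";
    } else {
        echo "no local link\n";
    }
}
```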

© 2021 The Archive Base. All Rights Reserved