Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • Home
  • SEARCH
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 1007119
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 16, 20262026-05-16T08:34:09+00:00 2026-05-16T08:34:09+00:00

Executive Summary: preg_replace() ran faster than string comparisons. Why? Shouldn’t regular expressions be slower?

  • 0

Executive Summary:

preg_replace() ran faster than string comparisons. Why? Shouldn’t regular expressions be slower?


In a recent question about detecting any of an array of disallowed substrings within a given input, I suggested comparing the result of a preg_replace() call to the original input, since preg_replace() can take an array of patterns as input. Thus my method for this could be a single if whereas the other solutions required one (or many) loops.

I’m not interested in debating my answer, because really it is less readable/maintainable than the loops. My answer there still holds a -1, and I’ll accept that for readability/ease of maintenance, but the biggest fault pointed out with my method was a lack of efficiency. That got me curious, and led me to do some testing. My results were a bit surprising to me: with all other factors held equal, preg_replace() was faster than any of the other methods.

Can you explain why this was the case?

My code for these tests can be found below, along with the results:

$input = "In a recent question about detecting any of an array of disallowed substrings within a given input, I suggested comparing the result of a `preg_replace()` call to the original input, since `preg_replace()` can take an array of patterns as input. Thus my method for this could be a single `if` whereas the other solutions required one (or many) loops. I'm not interested in debating my answer, because really it is less readable/maintainable than the loops. However, the biggest fault pointed out with my method was a lack of efficiency. That got me curious, and led me to do some testing. My results were a bit surprising to me: with all other factors held equal, `preg_replace()` was **faster** than any of the other methods. Can you explain why this was the case?";
$input2 = "Short sentence - no matches";
$input3 = "Word";
$input4 = "Short sentence - matches loop";

$start1 = microtime(true);
$rejectedStrs = array("loop", "efficiency", "explain");
$p_matches = 0;
for ($i = 0; $i < 10000; $i++) {
    if (str_check($rejectedStrs, $input)) $p_matches++;
    if (str_check($rejectedStrs, $input2)) $p_matches++;
    if (str_check($rejectedStrs, $input3)) $p_matches++;
    if (str_check($rejectedStrs, $input4)) $p_matches++;
}

$start2 = microtime(true);
$rejectedStrs = array("loop", "efficiency", "explain");
$l_matches = 0;
for ($i = 0; $i < 10000; $i++) {
    if (loop_check($rejectedStrs, $input)) $l_matches++;
    if (loop_check($rejectedStrs, $input2)) $l_matches++;
    if (loop_check($rejectedStrs, $input3)) $l_matches++;
    if (loop_check($rejectedStrs, $input4)) $l_matches++;
}

$start3 = microtime(true);
$rejectedStrs = array("/loop/", "/efficiency/", "/explain/");
$s_matches = 0;
for ($i = 0; $i < 10000; $i++) {
    if (preg_check($rejectedStrs, $input)) $s_matches++;
    if (preg_check($rejectedStrs, $input2)) $s_matches++;
    if (preg_check($rejectedStrs, $input3)) $s_matches++;
    if (preg_check($rejectedStrs, $input4)) $s_matches++;
}

$end = microtime(true);
echo $p_matches." ".$l_matches." ".$s_matches."\n";
echo "str_match: ".$start1." ".$start2."= ".($start2-$start1)."\nloop_match: ".$start2." ".$start3."=".($start3-$start2)."\npreg_match: ".$start3." ".$end."=".($end-$start3);

function preg_check($rejectedStrs, $input) {
    if($input == preg_replace($rejectedStrs, "", $input)) 
        return true;
    return false;
}

function loop_check($badwords, $string) {

    foreach (str_word_count($string, 1) as $word) {
        foreach ($badwords as $bw) {
            if (stripos($word, $bw) === 0) {
                return false;
            }
        }
    }
    return true;
}

function str_check($badwords, $str) {
    foreach ($badwords as $word) {
        if (stripos(" $str ", " $word ") !== false) {
            return false;
        }
    }
    return true;
}

Results

20000 20000 20000

str_match: 1282270516.6934 1282270518.5881= 1.894730091095

loop_match: 1282270518.5881 1282270523.0943=4.5061857700348

preg_match: 1282270523.0943 1282270523.6191=0.52475500106812

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-16T08:34:09+00:00Added an answer on May 16, 2026 at 8:34 am

    Let’s first look at preg_check and loop_check. Both of them will have to traverse the entire string, and they will have to check each of the individual words in each traversal. So their behavior will at least be O(n*m), where n is the length of the string and m the number of bad words. You can test this by running the algorithm with increasing values of n and m and plotting the 3D graphs (however, you may, or may not, have to run it with very high values of n and m to see this behavior).

    loop_check is more (asymptoticly) efficient here. The reason is that the number of words a string has is not proportional to their length — I seem to recall it typically follows a logarithmic function. It probably uses a hash table to store the words it finds through the way, which is done in average constant time (if we ignore that we may have to rebuild the hash table from time to time to accommodate more elements).

    Therefore loop_check will have an asymptotic behavior that follows something like n + m * log(n), which is better than n*m.

    Now, this refers to the asymptotic behavior of the algorithms, i.e., when m and n grow very (and it may require “very very”) large. For small values of m and n the constants play a big part. In particular, execution of PHP opcodes and PHP function calls are more costly than the same task implemented in C, just one function call away. This doesn’t make the regex algorithm faster, it just makes it faster for small values of m and n.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

Executive summary: how do I install a new root certificate into Java using Java
Executive summary: I need a way to determine whether a Windows process I've spawned
Executive summary: How can I define an arbitrarily-sized 2D array in C? How can
Note : This question has been re-asked with a summary of all debugging attempts
Executive summary: *Improve the memory efficiency of the date-based extraction functions used in aggregate
Executive Summary: When assertion errors are thrown in the threads, the unit test doesn't
Executive Summary context.Response.Header.Add("Content-Type", "application/json"); gets ignored while context.Response.ContentType = "application/json"; breaks jQuery ajax calls.
Summary: I periodically get a .NET Fatal Execution Engine Error on an application which
Summary: We're having problems with EF4 query compilation times of 12+ seconds. Cached queries
Summary When I execute a very simple program using Perl's Benchmark utility. I get

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.