Editorial Team
Asked: May 27, 2026

I have an application that fires several processes. Each process loads an HTML file


I have an application that fires several processes. Each process loads an HTML file and tries to find whether a pattern appears in it, something like this:

OUTER:
while(my ($prov,$arr_ref) = each(%{$self->{TAGS}})) {
    foreach my $tag (@{$arr_ref}) {
        if ($html =~ m/\Q$tag\E/i) {
            $provider = $prov;
            last OUTER;
        }
    }
}

Each key in $self->{TAGS} is a pattern name, and its value is a reference to an array of strings (scalars).

I was profiling the program, and found that this part:

$html =~ m/\Q$tag\E/i

makes my CPU jump to 100%. If I remove it, it barely gets to 10%.

The only approach I have in mind is to turn all the strings inside each array ref into compiled regexes (qr/.../). I doubt it will help much, though, since I suspect the real cost is the regex scanning each whole HTML page, which can be hundreds of bytes in size.

What can I do to improve this section?

SUB-QUESTION: Following the answers below and some testing of my own, I will sharpen my question. The issue is NOT the regex: I had already tried index before asking this question, and I also tried compiled regexes with qr//. The issue is the size of the HTML files. $html holds HTML text that is sometimes small and sometimes big, so the question is: WHAT IS THE BEST WAY (resource-wise) TO FIND WHETHER A STRING APPEARS INSIDE A LARGER STRING (LET'S SAY 1 MB IN SIZE)?

Thanks.


1 Answer

    Editorial Team
    Added an answer on May 27, 2026 at 2:55 pm

    Using index should improve performance, since it avoids the overhead of the regular-expression engine. Please benchmark it to confirm!

    my $html_searchable = lc($html);   # lowercase the page once, outside the loops
    
    ...    
    
    while ( ... ) {
      foreach ( ... ) {
        # index returns -1 when the substring is not found
        if (index($html_searchable, lc($tag)) > -1) {
          ... # we got a match
        }
      }
    }
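The benchmark suggested above could look something like the following sketch, which uses the core Benchmark module to compare the case-insensitive regex against index on a lowercased copy. The page content, the tag, and the sub names (match_regex, match_index) are made up for illustration; real pages and real tags may give different ratios, so treat the numbers as a starting point only.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: a page of roughly 1 MB with the tag near the end.
my $html = ('<div class="filler">lorem ipsum</div>' x 27_000)
         . '<meta name="Example-Tag">';
my $tag  = 'example-tag';

# Lowercased once, outside the timed code, as in the loop above.
my $html_lc = lc $html;

# Candidate 1: the case-insensitive regex from the question.
sub match_regex { return $html =~ m/\Q$tag\E/i ? 1 : 0 }

# Candidate 2: plain substring search on the lowercased copy.
sub match_index { return index($html_lc, lc $tag) > -1 ? 1 : 0 }

# Run each candidate for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    regex => \&match_regex,
    index => \&match_index,
});
```

Placing the tag near the end of the page makes both candidates scan almost the whole string, which is close to the worst case for a page that does match.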
    

    If you’d like to speed it up even more, store all your tags as lowercase strings so that you don’t have to lc the same string multiple times.
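As a minimal sketch of that idea, assuming the tags live in a hash of provider name to array ref as the question describes: the tags are lowercased once up front, and each page is lowercased once per search. The hash contents, %tags_lc, and find_provider are hypothetical names used only for this example.

```perl
use strict;
use warnings;

# Hypothetical data in the shape the question describes:
# provider name => reference to an array of tag strings.
my %tags = (
    provider_a => [ 'Powered-By-A', 'a-widget.js' ],
    provider_b => [ 'B-Analytics' ],
);

# Lowercase every tag once, up front, instead of once per page.
my %tags_lc;
for my $prov (keys %tags) {
    $tags_lc{$prov} = [ map { lc $_ } @{ $tags{$prov} } ];
}

# Return the first provider (in sorted name order) that has a tag
# appearing in $html, or undef if nothing matches.
sub find_provider {
    my ($html, $tags_lc) = @_;
    my $html_lc = lc $html;    # lowercase the page once per search
    for my $prov (sort keys %$tags_lc) {
        for my $tag (@{ $tags_lc->{$prov} }) {
            return $prov if index($html_lc, $tag) > -1;
        }
    }
    return;
}

my $provider = find_provider('<script src="/B-ANALYTICS.js"></script>', \%tags_lc);
print defined $provider ? "$provider\n" : "no match\n";
```

Iterating over keys with an early return also sidesteps a known gotcha of leaving a while/each loop early: the hash iterator stays where it stopped, so a later each on the same hash resumes from there unless it is reset.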

    Documentation

    • index – perldoc.perl.org
    • lc – perldoc.perl.org