Editorial Team
Asked: May 28, 2026

I have a program that runs a large number of regular expressions (10+) on a fairly long set of texts (5-15 texts of about 1000 words each).

Every time that is done I feel like I forgot a Thread.Sleep(5000) in there somewhere. Are regular expressions really processor-heavy or something? It’d seem like a computer should crank through a task like that in a millisecond.

Should I try and group all the regular expressions into ONE monster expression? Would that help?

Thanks

EDIT: Here’s a regex that runs right now:

Regex rgx = new Regex(@"((.*(\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*).*)|(.*(keyword1)).*|.*(keyword2).*|.*(keyword3).*|.*(keyword4).*|.*(keyword5).*|.*(keyword6).*|.*(keyword7).*|.*(keyword8).*|.*(count:\n[0-9]|count:\n\n[0-9]|Count:\n[0-9]|Count:\n\n[0-9]|Count:\n).*|.*(keyword10).*|.*(summary: \n|Summary:\n).*|.*(count:).*)", RegexOptions.Compiled | RegexOptions.IgnoreCase);

Regex regex = new Regex(@".*(\.com|\.biz|\.net|\.org|\.co\.uk|\.bz|\.info|\.us|\.cm|(<a href=)).*", RegexOptions.Compiled | RegexOptions.IgnoreCase);

It’s pretty huge, no doubt about it. The idea is that if it finds any of the keywords or a link, it takes out the whole paragraph surrounding it.

1 Answer

  1. Editorial Team
    Added an answer on May 28, 2026 at 2:11 am

    Regexes don’t kill CPUs, regex authors do. 😉

    But seriously, if regexes always ran as slowly as you describe, nobody would be using them. Before you start loading up silver bullets like the Compiled option, you should go back to your regex and see if it can be improved.

    And it can. Each keyword is in its own branch/alternative, and each branch starts with .*, so the first thing each branch does is consume the remainder of the current paragraph (i.e., everything up to the next newline). Then it starts backtracking as it tries to match the keyword. If it gets back to the position it started from, the next branch takes over and does the same thing.

    When all branches have reported failure, the regex engine bumps ahead one position and goes through all the branches again. That’s over a dozen branches, times the number of characters in the paragraph, times the number of paragraphs… I think you get the point. Compare that to this regex:

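    // One anchored pattern, so each line (paragraph) gets a single match attempt.
    // "summary: ?\n" covers both the "summary: \n" and "Summary:\n" forms from the question.
    Regex re = new Regex(@"^.*?(\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*|keyword1|keyword2|keyword3|keyword4|keyword5|keyword6|keyword7|keyword8|count:(\n\n?[0-9]?)?|keyword10|summary: ?\n).*$",
        RegexOptions.Multiline | RegexOptions.IgnoreCase | RegexOptions.ExplicitCapture);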

    There are three major changes:

    • I factored out the leading and trailing .*
    • I changed the leading one to .*?, making it non-greedy
    • I added start-of-line and end-of-line anchors (^ and $ in Multiline mode)

    Now it only makes one match attempt per paragraph (pass or fail), and it practically never backtracks. I could probably make it even more efficient if I knew more about your data. For example, if every keyword/token/whatever starts with a letter, a word boundary would have an appreciable effect (e.g. ^.*?\b(\w+...).
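    As a rough illustration of how an anchored pattern like this might be used to take out whole paragraphs, here is a minimal sketch. It assumes each paragraph sits on a single line ending in \n. The class name ParagraphFilter, the method RemoveFlagged, the shortened two-keyword pattern, and the sample text are all hypothetical placeholders, not part of the original program. The trailing \n? is an addition so that the line break is removed together with the paragraph.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ParagraphFilter
{
    // Shortened, hypothetical version of the anchored pattern: an e-mail-like
    // token or one of two placeholder keywords anywhere on the line.
    // The trailing \n? also consumes the line break after a matching line.
    private static readonly Regex Flagged = new Regex(
        @"^.*?(\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*|keyword1|keyword2).*$\n?",
        RegexOptions.Multiline | RegexOptions.IgnoreCase | RegexOptions.ExplicitCapture);

    // Returns the text with every flagged line (paragraph) removed.
    public static string RemoveFlagged(string text)
    {
        return Flagged.Replace(text, string.Empty);
    }

    public static void Main()
    {
        string text = "First paragraph, nothing special.\n"
                    + "Write to someone@example.com today.\n"
                    + "Contains KEYWORD2 somewhere.\n"
                    + "Last paragraph.";

        // Only the first and last paragraphs should remain.
        Console.WriteLine(RemoveFlagged(text));
    }
}
```

    Because the pattern is anchored with ^ in Multiline mode, a line that contains none of the tokens fails once at its start and is not retried branch by branch from every later position.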

    The ExplicitCapture option makes all the “bare” groups ((...)) act like non-capturing groups ((?:...)), reducing the overhead a little more without adding clutter to the regex. If you want to capture the token, just change that first group to a named group (e.g. (?<token>\w+...).
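    To make the named-group idea concrete, here is a small sketch that reads the captured token back out of each match. The class name TokenFinder, the method FindTokens, and the shortened two-keyword pattern are hypothetical and only stand in for the full regex above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TokenFinder
{
    // With ExplicitCapture only the named group "token" is captured; the bare
    // (...) groups nested inside it behave like non-capturing (?:...) groups.
    private static readonly Regex Flagged = new Regex(
        @"^.*?(?<token>\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*|keyword1|keyword2).*$",
        RegexOptions.Multiline | RegexOptions.IgnoreCase | RegexOptions.ExplicitCapture);

    // Returns the first flagged token found on each matching line.
    public static List<string> FindTokens(string text)
    {
        var tokens = new List<string>();
        foreach (Match m in Flagged.Matches(text))
        {
            tokens.Add(m.Groups["token"].Value);
        }
        return tokens;
    }

    public static void Main()
    {
        string text = "Nothing to see here.\n"
                    + "Write to someone@example.com today.\n"
                    + "Contains KEYWORD2 somewhere.";

        foreach (string token in FindTokens(text))
        {
            Console.WriteLine(token);
        }
    }
}
```

    Since the leading .*? is non-greedy, the group holds the first token on the line, which may be useful for logging why a paragraph was flagged.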
