The Archive Base

Editorial Team
Asked: May 28, 2026 (2026-05-28T08:21:13+00:00)

I started with the first code snippet below to search a list of lines

I started with the first code snippet below to search a list of lines and convert all keywords (identified in a separate list) in each line to lower case. For my test list of lines about 800 lines long, the keyword substitution for the entire list of lines takes less than a second as long as my keyword list is 100 items or fewer. When I extend the list to 101 items or more, the processing time jumps to over 9 seconds.

Using the second snippet (where all the patterns for the keyword list are compiled) drops the total processing time back down below 1 second.

Does anyone know why the processing time for the non-compiled substitution search is so sensitive to the number of items searched per input line? I’m surprised it jumps so sharply after 100 keywords.

snippet #1

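import re

lines_out = []
for line in lines_in:
    for keyword in keywords:
        # The pattern is passed as a string on every call, so re.sub depends on re's internal cache.
        line = re.sub(r'\b' + keyword + r'\b', keyword, line, flags=re.IGNORECASE)
    lines_out.append(line)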

snippet #2

import re

# Compile one pattern per keyword up front, outside the line loop.
patterns = [re.compile(r'\b' + keyword + r'\b', re.IGNORECASE) for keyword in keywords]

lines_out = []
for line in lines_in:
    for keyword, pattern in zip(keywords, patterns):
        line = pattern.sub(keyword, line)
    lines_out.append(line)

1 Answer

  1. Editorial Team
    Added an answer on May 28, 2026 at 8:21 am

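    This is because Python caches compiled regexes internally, and in the version of the re module this answer refers to, that cache holds at most 100 entries (see line 227 of its source). Furthermore, lines 246-247 show that when the cache reaches its maximum size it is cleared outright rather than pruned with a more selective eviction policy. With 100 or fewer keywords, every pattern is compiled once while the first line is processed and stays cached from then on. With 101 or more, the cache is cleared on every pass through the keyword list, so each input line causes all 100+ regexes to be recompiled. The limit is an implementation detail of the re module and may differ between Python versions.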
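
The jump at 101 keywords can be reproduced with a toy model of a clear-when-full cache. This is only a sketch of the policy described above, not the re module's actual code: the class ClearOnFullCache, the helper count_compiles, and the kw0, kw1, ... pattern strings are hypothetical names made up for illustration, and a "compile" is merely counted rather than performed.

```python
class ClearOnFullCache:
    """Toy cache (hypothetical): emptied entirely once it holds max_size entries."""

    def __init__(self, max_size=100):
        self.max_size = max_size
        self.entries = {}
        self.compiles = 0  # number of cache misses, i.e. simulated compilations

    def get(self, pattern_text):
        if pattern_text in self.entries:
            return self.entries[pattern_text]  # cache hit, nothing to compile
        self.compiles += 1
        if len(self.entries) >= self.max_size:
            self.entries.clear()  # drop everything, not just one old entry
        self.entries[pattern_text] = pattern_text  # placeholder for a compiled regex
        return pattern_text


def count_compiles(num_keywords, num_lines, max_size=100):
    """Count simulated compilations for num_lines passes over num_keywords patterns."""
    cache = ClearOnFullCache(max_size)
    patterns = [r"\bkw%d\b" % i for i in range(num_keywords)]
    for _ in range(num_lines):
        for pattern_text in patterns:
            cache.get(pattern_text)
    return cache.compiles


print(count_compiles(100, 800))  # 100
print(count_compiles(101, 800))  # 80800
```

In this model, one extra keyword turns 100 compilations into one compilation per keyword per line, which matches the shape of the slowdown reported in the question.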

    The performance is back to “normal” in your second example because it doesn’t rely on the internal cache staying intact to keep compiled regexes around.
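
Another way to avoid depending on the internal cache, offered as a sketch rather than as what the question's code does, is to compile a single alternation of all keywords and pick the replacement in a callback. The function name build_keyword_normalizer and the sample keywords are assumptions for illustration. It assumes the keywords are plain words and do not overlap one another, in which case one pass gives the same result as substituting keyword by keyword.

```python
import re


def build_keyword_normalizer(keywords):
    """Return a function that rewrites every keyword occurrence to its listed spelling."""
    # Map the lower-cased form of each keyword to the spelling wanted in the output.
    canonical = {kw.lower(): kw for kw in keywords}
    # Longest first, so a longer keyword is tried before a shorter one at the same position.
    ordered = sorted(keywords, key=len, reverse=True)
    alternation = "|".join(re.escape(kw) for kw in ordered)
    pattern = re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)

    def normalize(line):
        # Fall back to the matched text if the lookup misses for any reason.
        return pattern.sub(lambda m: canonical.get(m.group(0).lower(), m.group(0)), line)

    return normalize


normalize = build_keyword_normalizer(["select", "from", "where"])
print(normalize("SELECT name FROM users WHERE id = 1"))  # select name from users where id = 1
```

Only one pattern is ever compiled here, so the number of keywords has no bearing on how the cache behaves.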

© 2021 The Archive Base. All Rights Reserved