The Archive Base


Editorial Team
Asked: May 16, 2026 (2026-05-16T22:52:14+00:00)


Note: This is a follow up to this question.

I have a “legacy” program which does hundreds of string matches against big chunks of HTML. For example, if the HTML matches one of 20+ strings, do something; if it matches one of four other strings, do something else. There are 50-100 such groups of strings to match against these chunks of HTML (usually whole pages).

I’m taking a whack at refactoring this mess of code and trying to come up with a good approach to do all these matches.

The performance requirements of this code are rather strict. It must not wait on I/O while doing these matches, so the strings need to be in memory. Also, there can be 100+ copies of this process running at the same time, so heavy I/O on startup could slow down I/O for the other copies.

With these requirements in mind, it would be most efficient if only one copy of these strings is stored in RAM (see my previous question linked above).

This program currently runs on Windows with the Microsoft compiler, but I’d like to keep the solution as cross-platform as possible, so I don’t think I want to use PE resource files or anything similar.

Mmapping an external file might work, but then I have the issue of keeping the program version and the data version in sync, since one does not normally change without the other. It also requires some file “format”, which adds a layer of complexity I’d rather not have.

So after all of this preamble, it seems like the best solution is to have a bunch of arrays of strings which I can then iterate over. This seems kind of messy, as I’m mixing code and data heavily, but given the above requirements, is there any better way to handle this sort of situation?
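To make the "arrays of strings" idea concrete, here is a minimal sketch of what it could look like in C++17. The `Action` values, the pattern strings, and the `classify` function are all hypothetical placeholders, and stopping at the first matching group is a simplification of this sketch. Constant tables like these usually land in the executable's read-only data, which operating systems can generally share between processes running the same binary, but that is an assumption worth checking for your toolchain.

```cpp
#include <cstddef>
#include <iterator>
#include <string_view>

// Hypothetical actions; the real program would have 50-100 groups.
enum class Action { None, FlagAsForum, FlagAsShop };

// One group: the action to take and the patterns that trigger it.
struct PatternGroup {
    Action action;
    const std::string_view* patterns;
    std::size_t count;
};

// Placeholder patterns, for illustration only.
constexpr std::string_view kForumPatterns[] = {
    "Powered by ExampleBoard",
    "id=\"thread-list\"",
};
constexpr std::string_view kShopPatterns[] = {
    "Add to cart",
    "class=\"checkout\"",
};

constexpr PatternGroup kGroups[] = {
    {Action::FlagAsForum, kForumPatterns, std::size(kForumPatterns)},
    {Action::FlagAsShop, kShopPatterns, std::size(kShopPatterns)},
};

// Returns the action of the first group (in table order) that has at
// least one pattern occurring anywhere in html, or Action::None.
Action classify(std::string_view html) {
    for (const PatternGroup& group : kGroups) {
        for (std::size_t i = 0; i < group.count; ++i) {
            if (html.find(group.patterns[i]) != std::string_view::npos) {
                return group.action;
            }
        }
    }
    return Action::None;
}
```

Keeping every group in one table at least confines the code/data mixing to a single file, and the matching loop itself stays free of literals.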


1 Answer

  1. Editorial Team
    Added an answer on May 16, 2026 at 10:52 pm

    I’m not sure just how slow the current implementation is, so it’s hard to recommend optimizations without knowing what level of optimization is needed.

    That said, I might suggest a two-stage approach: take your string list, compile it into a radix tree, and then save this tree in some custom format (XML might be good enough for your purposes).

    Then your process startup should consist of reading in the radix tree and matching against it. If you want or need to optimize the memory storage of the tree, that can be done as a separate project, but it sounds to me like improving the matching algorithm would be a more efficient use of time. In some ways this is a ‘roll your own regex system’ idea, rather similar to the suggestion to use a parser generator.
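As a rough illustration of the matching stage, here is a sketch that uses a plain, uncompressed trie rather than a true radix tree (a radix tree would additionally merge single-child chains into one edge). The class name `PatternTrie`, the integer group ids, and the one-group-per-pattern rule are assumptions of this sketch, not something from the question.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string_view>
#include <vector>

// A byte-wise trie; each stored pattern is tagged with one group id.
class PatternTrie {
public:
    PatternTrie() : nodes_(1) {}  // node 0 is the root

    // Registers a pattern for a group id (>= 0). Empty patterns are ignored.
    void insert(std::string_view pattern, int group) {
        if (pattern.empty()) return;
        std::size_t node = 0;
        for (unsigned char c : pattern) {
            auto it = nodes_[node].next.find(c);
            if (it == nodes_[node].next.end()) {
                const std::size_t child = nodes_.size();
                nodes_.emplace_back();
                nodes_[node].next.emplace(c, child);
                node = child;
            } else {
                node = it->second;
            }
        }
        nodes_[node].group = group;
    }

    // Returns the ids of all groups with at least one pattern in text.
    // The trie is walked from every start offset, so the cost is roughly
    // text length times the longest pattern length.
    std::set<int> matchingGroups(std::string_view text) const {
        std::set<int> found;
        for (std::size_t start = 0; start < text.size(); ++start) {
            std::size_t node = 0;
            for (std::size_t i = start; i < text.size(); ++i) {
                const auto c = static_cast<unsigned char>(text[i]);
                auto it = nodes_[node].next.find(c);
                if (it == nodes_[node].next.end()) break;
                node = it->second;
                if (nodes_[node].group >= 0) found.insert(nodes_[node].group);
            }
        }
        return found;
    }

private:
    struct Node {
        std::map<unsigned char, std::size_t> next;  // child node index per byte
        int group = -1;                             // -1: no pattern ends here
    };
    std::vector<Node> nodes_;
};
```

All groups are checked in one scan of the page instead of one `find` call per string. If that is still too slow, the Aho-Corasick algorithm adds failure links to the same kind of trie so the text is traversed only once.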

    Edit: I’ve used something similar to this where, as a precompile step, a custom script generates a somewhat optimized structure and saves it to a large char* array. (Obviously the array can’t be too big, but it’s another option.)

    The idea is to keep the string list in its original form (which makes maintenance reasonably easy), while the precompilation step speeds up access at runtime.
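A precompile step along those lines might look like the sketch below. It is written in C++ only to stay in one language; a script would do equally well. The function name `generateBlobSource`, the array name passed in, and the byte layout (each pattern NUL-terminated, one extra NUL closing each group) are all assumptions made for illustration, not the format I actually used.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Emits C++ source that defines one byte array holding every pattern.
// Assumed layout: each pattern is followed by a NUL byte, and one extra
// NUL closes each group. Assumes at least one group and no empty patterns.
std::string generateBlobSource(
        const std::vector<std::vector<std::string>>& groups,
        const std::string& arrayName) {
    std::string out = "// Generated file, do not edit.\n";
    out += "extern const unsigned char " + arrayName + "[] = {\n";

    // Bytes are written as hex literals so no string escaping is needed.
    auto emitByte = [&out](unsigned char b) {
        char buf[8];
        std::snprintf(buf, sizeof buf, "0x%02x,", static_cast<unsigned>(b));
        out += buf;
    };

    for (const auto& group : groups) {
        out += "  ";
        for (const auto& pattern : group) {
            for (unsigned char b : pattern) emitByte(b);
            emitByte(0);  // end of pattern
        }
        emitByte(0);      // end of group
        out += "\n";
    }
    out += "};\n";
    return out;
}
```

The generated file is compiled into the program like any other source file, so the data cannot drift out of sync with the program version, and the human-edited string list stays separate from the code.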
