Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 1086905
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 16, 20262026-05-16T22:52:14+00:00 2026-05-16T22:52:14+00:00

Note: This is a follow up to this question . I have a legacy

  • 0

Note: This is a follow up to this question.

I have a “legacy” program which does hundreds of string matches against big chunks of HTML. For example if the HTML matches 1 of 20+ strings, do something. If it matches 1 of 4 other strings, do something else. There are 50-100 groups of these strings to match against these chunks of HTML (usually whole pages).

I’m taking a whack at refactoring this mess of code and trying to come up with a good approach to do all these matches.

The performance requirements of this code are rather strict. It needs to not wait on I/O when doing these matches so they need to be in memory. Also there can be 100+ copies of this process running at the same time so large I/O on startup could cause slow I/O for other copies.

With these requirements in mind it would be most efficient if only one copy of these strings are stored in RAM (see my previous question linked above).

This program currently runs on Windows with Microsoft compiler but I’d like to keep the solution as cross-platform as possible so I don’t think I want to use PE resource files or something.

Mmapping an external file might work but then I have the issue of keeping program version and data version in sync, one does not normally change without the other. Also this requires some file “format” which adds a layer of complexity I’d rather not have.

So after all of this pre-amble it seems like the best solution is to have a bunch arrays of strings which I can then iterate over. This seems kind of messy as I’m mixing code and data heavily, but with the above requirements is there any better way to handle this sort of situation?

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-16T22:52:15+00:00Added an answer on May 16, 2026 at 10:52 pm

    I’m not sure just how slow the current implementation is. So it’s hard to recommend optimizations without knowing what level of optimization is needed.

    Given that, however, I might suggest a two-stage approach. Take your string list and compile it into a radix tree, and then save this tree to some custom format (XML might be good enough for your purposes).

    Then your process startup should consist of reading in the radix tree, and matching. If you want/need to optimize the memory storage of the tree, that can be done as a separate project, but it sounds to me like improving the matching algorithm would be a more efficient use of time. In some ways this is a ‘roll your own regex system’ idea. Rather similar to the suggestion to use a parser generator.

    Edit: I’ve used something similar to this where, as a precompile step, a custom script generates a somewhat optimized structure and saves it to a large char* array. (obviously it can’t be too big, but it’s another option)

    The idea is to keep the list there (making maintenance reasonably easy), but having the pre-compilation step speed up the access during runtime.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

Note that this function does not have a { and } body. Just a
this is a follow on question to a previously asked question . I have
I have a programming question, as follows, for which my solution does not produce
I have a number of xml files that should follow this format: <root> <question>What
(Note: This is for MySQL's SQL, not SQL Server.) I have a database column
Note This is not a REBOL-specific question. You can answer it in any language.
(NOTE: This question is not about escaping queries, it's about escaping results) I'm using
Note: This is the opposite direction to most similar questions! I have an iPhone
Moderator note: This question is not a good fit for our question and answer
Let's say I have a simple stored procedure that looks like this (note: this

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.