Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 5957183
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 22, 20262026-05-22T18:21:52+00:00 2026-05-22T18:21:52+00:00

I have a huge (100k+ lines, 5MB+) XML which acts as a database for

  • 0

I have a huge (100k+ lines, 5MB+) XML which acts as a database for my C++ Application. The structure of the XML is quite straight forward, for example, it has chunks of:

<foo>
<bar prop="true"/>
<baz>blah</baz>
</foo>

The nesting of tags is several levels deep and there are many items with multiple properties. What is a good way to find and replace chunks of this kind of a file? For example, assume that the above section is repeated a few dozen times and in each chunk the value of the tag <baz> is different. I’d like to make edits such as:

  • Setting all the values contained in tag <baz> to a given value.
  • Remove chunks containing certain values
  • Etc.

So far, I’ve learnt of the following methods for accomplishing this:

  • Find/Replace: A no-brainer, trivial solution and also my last fall-back. This approach, IMHO is the most time consuming, error prone and painful method. The absolute last resort.

  • RegExes: Use regular expressions to match blocks of interest and edit them using replacement expressions. Kinda like this blog entry: http://blogs.msdn.com/b/vseditor/archive/2004/08/12/213770.aspx. But I feel this would be error prone and there could be a bunch of missed items if the regex is not exactly right the first time around.

  • Parser & Save: Whip up a quick program to parse the XML using Xerces or XML DOM Interfaces (or some other XML library), read the XML in, manipulate it as desired and save back to disk. Again, this approach is a slow process, but once its up and running, easy to make modifications and more flexible then RegExes.

Are there any better ways to deal with this?
(EDIT: Thanks for all the redo it to use a DB suggestions, I know its a huge mess but by “better ways to deal with this” I meant the “find/replace” part. )

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-22T18:21:52+00:00Added an answer on May 22, 2026 at 6:21 pm

    If you don’t want to put the entire document in memory, I would read it using a SAX parser. As you read it, you append the transformed document to a second (or a temp) file. I think it could be pretty fast, and use only a little memory footprint.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I have huge 3D arrays of numbers in my .NET application. I need to
I have a huge database with 100's of tables and stored procedures. Using SQL
I have a huge database with some 100 tables and some 250 stored procedures.
I have huge number of Word files I need to merge (join) into one
I have a huge web app that is having issues with memory leak in
I have a huge file, where I have to insert certain characters at a
I have a huge text file (~1GB) and sadly the text editor I use
I have a huge dictionary of blank values in a variable called current like
I have a huge file that I must parse line by line. Speed is
I have a huge mbox file, with maybe 500 emails in it. It looks

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.