The Archive Base

Editorial Team
Asked: May 29, 2026

My C# application loops over 5000 files and then writes the values of xpaths

My C# application loops over 5000 files and then writes the values of XPaths to cells in an Excel sheet. It is quite slow, processing about 40 files a second.

After profiling I discovered that this line accounts for over 50% of all time used:

XmlDocument.Load(filename);

To write to Excel, I loop over each XPath of each file and do:

worksheet.Cells[row, col] = value;

Would it be faster to load all the XML files into memory at once (they are less than 20 KB each), store them in a collection, and then transfer them all to Excel?

I understand that multi-threading might reduce performance rather than improve it, as the process is I/O-bound.

1 Answer

  1. Editorial Team
    Added an answer on May 29, 2026 at 9:16 am

    It might not be I/O-bound: most of the time is spent constructing the XML DOM. However, multi-threading would introduce a possible issue, depending on where you’re writing the results to Excel. I don’t know for sure, but I wouldn’t be surprised if you could only access the Office objects from a single thread.

    You would have to add an additional step of collecting the results before writing to the Excel object. This would have to be some sort of synchronized collection, either drained by another thread dedicated to writing to Excel or written out after all of the files are processed.

    Now, going back to the first point: most of the time is spent loading the DOM. Based on the results from http://www.nearinfinity.com/blogs/joe_ferner/performance_linq_to_sql_vs.html, if you still need DOM-related methods, I would look at using XDocument instead. The interface isn’t that far off XmlDocument, so it should be an easy adaptation.
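As a rough sketch of what the XDocument version could look like: the class and method names below are made up for illustration, and it assumes every XPath selects an element (attribute or text() paths would need XPathEvaluate instead).

```csharp
using System.Xml.Linq;
using System.Xml.XPath; // provides the XPathSelectElement extension method

// Hypothetical helper, not from the original question.
public static class XDocumentExtractor
{
    // Returns one value per XPath, or null where nothing matched.
    public static string[] ExtractValues(string fileName, string[] xpaths)
    {
        // XDocument.Load replaces XmlDocument.Load as the parsing step.
        XDocument doc = XDocument.Load(fileName);
        var values = new string[xpaths.Length];
        for (int i = 0; i < xpaths.Length; i++)
        {
            // Assumption: each XPath selects an element, not an attribute.
            XElement match = doc.XPathSelectElement(xpaths[i]);
            values[i] = match?.Value;
        }
        return values;
    }
}
```

Whether this is actually faster for these files is worth confirming with the same profiler that flagged the Load call.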

    For the fastest XML processing, look into XmlReader. However, it gives you no DOM functions and can be harder to work with than the two DOM-based approaches.
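To give an idea of the XmlReader style, here is a minimal sketch that does one forward pass and picks out elements by local name. It does not evaluate XPaths. It also assumes the wanted elements contain only text (ReadElementContentAsString throws if they have child elements), and the helper names are invented for the example.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper, not from the original question.
public static class XmlReaderExtractor
{
    // Collects the text of the first occurrence of each wanted element name.
    public static Dictionary<string, string> ReadElements(TextReader input, ISet<string> wantedNames)
    {
        var found = new Dictionary<string, string>();
        using (XmlReader reader = XmlReader.Create(input))
        {
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element
                    && wantedNames.Contains(reader.LocalName)
                    && !found.ContainsKey(reader.LocalName))
                {
                    string name = reader.LocalName;
                    // This call consumes the element and moves past its end tag,
                    // so Read() must not be called again in this branch.
                    found[name] = reader.ReadElementContentAsString();
                }
                else
                {
                    reader.Read();
                }
            }
        }
        return found;
    }
}
```

For a file on disk, the input could be a StreamReader over the file. No tree is built, which is where the speed comes from, but the matching logic is now your own code rather than an XPath.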

    So, in short: first try converting to XDocument, which might roughly double your speed. I would then look at making the processing multithreaded (perhaps using PLINQ over the list of files). Finally, if performance is still not enough, try the XmlReader interface.
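The PLINQ step might look roughly like the sketch below. FileResult and ExtractAll are illustrative names, and the XPaths are again assumed to select elements. The files are parsed in parallel, but the results come back as an ordinary list, so the Excel writing can stay on the calling thread.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

// Hypothetical types, not from the original question.
public static class ParallelExtractor
{
    public sealed class FileResult
    {
        public string FileName;
        public string[] Values;
    }

    public static List<FileResult> ExtractAll(IEnumerable<string> fileNames, string[] xpaths)
    {
        return fileNames
            .AsParallel()
            .AsOrdered() // keep results in the same order as the input files
            .Select(fileName =>
            {
                XDocument doc = XDocument.Load(fileName);
                return new FileResult
                {
                    FileName = fileName,
                    // Assumption: each XPath selects an element.
                    Values = xpaths.Select(xp => doc.XPathSelectElement(xp)?.Value).ToArray()
                };
            })
            .ToList(); // materialize before touching Excel
    }
}
```

AsOrdered costs a little throughput. It can be dropped if the row order in the sheet does not matter.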

    EDIT, in response to the question of which collection types to use:

    I see two basic options for this, depending on how long it takes to process the XML files. If it is a small percentage of the overall process (most time is spent dealing with Excel), just have a List<T>, where T is some representation of the data you need to write to Excel (it could even be a string if that’s all you need), with the calls to .Add wrapped in lock statements. Then, once XML processing is complete, the Excel writer iterates over this collection.
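A minimal sketch of the locked List<T> option, using string[] as the row type. ResultCollector, CollectorDemo, and the extract delegate are placeholders for whatever your XML processing looks like.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical collector: a plain List<T> guarded by a lock.
public sealed class ResultCollector
{
    private readonly object _gate = new object();
    private readonly List<string[]> _rows = new List<string[]>();

    public void Add(string[] row)
    {
        lock (_gate) { _rows.Add(row); }
    }

    // Copy taken under the lock, safe to iterate afterwards.
    public List<string[]> Snapshot()
    {
        lock (_gate) { return new List<string[]>(_rows); }
    }
}

public static class CollectorDemo
{
    // 'extract' stands in for the per-file XML processing.
    public static List<string[]> ProcessAll(IEnumerable<string> fileNames, Func<string, string[]> extract)
    {
        var collector = new ResultCollector();
        Parallel.ForEach(fileNames, fileName => collector.Add(extract(fileName)));
        // All workers are finished here; the rows can now be written to Excel
        // from this single thread.
        return collector.Snapshot();
    }
}
```

Rows arrive in completion order, not file order, so include the file name in the row if the order matters.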

    Another option, if XML processing takes a while and you’re on .NET 4, is the ConcurrentQueue class. It provides thread safety on its own (and now that I look, one of the Concurrent collections could be used in the first case too, either ConcurrentQueue or BlockingCollection). You would then have threads processing XML and a consumer thread that writes out to Excel.
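The producer/consumer arrangement could be sketched with BlockingCollection, which wraps a ConcurrentQueue by default. The extract and writeRow delegates are placeholders: writeRow is the only code that would touch Excel, and it always runs on the single consumer thread.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical pipeline, not from the original question.
public static class ProducerConsumerPipeline
{
    public static void Run(
        IEnumerable<string> fileNames,
        Func<string, string[]> extract,   // per-file XML processing
        Action<string[]> writeRow)        // e.g. writes one row to the worksheet
    {
        using (var queue = new BlockingCollection<string[]>())
        {
            // Single consumer: every writeRow call happens on this one thread.
            Task consumer = Task.Factory.StartNew(() =>
            {
                foreach (string[] row in queue.GetConsumingEnumerable())
                {
                    writeRow(row);
                }
            }, TaskCreationOptions.LongRunning);

            try
            {
                // Producers: parse files in parallel and enqueue the results.
                Parallel.ForEach(fileNames, fileName => queue.Add(extract(fileName)));
            }
            finally
            {
                // Lets the consumer loop end once the queue is drained.
                queue.CompleteAdding();
                consumer.Wait();
            }
        }
    }
}
```

The queue is left unbounded here because the files are small. A bounded capacity would cap memory use, but the producers would then block forever if the consumer failed.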

    A few other things. Expanding on a comment on the question: if you’re doing nothing that needs Excel-specific functions, you could just write out to CSV. The library at http://www.codeproject.com/Articles/86973/C-CSV-Reader-and-Writer is rather straightforward to use and handles embedded commas. The downside is the Big Scary Dialogs Excel throws up if you try to save a CSV. These might be overcome with user training, however.
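To show what "handles embedded commas" involves, here is a small hand-rolled writer. It does not use the linked library's API, and SimpleCsv is a name invented for this sketch. Fields containing a comma, a quote, or a line break are wrapped in quotes, with embedded quotes doubled.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical minimal CSV writer; a real library covers more edge cases.
public static class SimpleCsv
{
    public static string EscapeField(string field)
    {
        if (field == null) return "";
        bool needsQuotes = field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        // Inside a quoted field, a literal quote is written as two quotes.
        return needsQuotes ? "\"" + field.Replace("\"", "\"\"") + "\"" : field;
    }

    public static void WriteRows(TextWriter writer, IEnumerable<string[]> rows)
    {
        foreach (string[] row in rows)
        {
            writer.WriteLine(string.Join(",", row.Select(f => EscapeField(f))));
        }
    }
}
```

Passing a StreamWriter over the output file as the writer would produce a file Excel can open directly.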

    Another option would be to use the OpenXML library to generate Excel files, provided you aren’t already and you’re targeting at least Excel 2007 (although Excel 2003 can read xlsx files with an add-in). I imagine that, since this library manipulates XML directly, it would be faster than dealing with Excel interop, and also safer (no dialogs from Excel, no zombie processes, etc.).
