The Archive Base

Editorial Team
Asked: May 29, 2026 (2026-05-29T09:28:55+00:00)

I need to process incoming xml files (they will be created by other application

I need to process incoming XML files (created by another application directly in a specific folder), and I need to do it fast.

There can be up to 200,000 files per day, and my current plan is to use .NET 4 and the TPL.

My current service concept is:

In one loop I want to check the folder for new files. Any files I find go into a queue, which is consumed by a second loop that takes files from the queue and creates a new task (thread) for each one. The number of simultaneous tasks should be configurable.
The first part is easy, but building two main loops with a queue between them is new to me.

And the question:
How do I create two loops (one that checks the folder and adds files, and a second that takes files from the queue and processes them in parallel) with a queue to communicate between them?

For the first part (folder checking), the suggested solution is to use FileSystemWatcher. The second part still needs to be discussed (maybe some kind of task scheduler).

1 Answer

    Editorial Team
    Added an answer on May 29, 2026 at 9:28 am

    You may not need loops, and I'm not sure parallel processing is necessary either; it would be useful if you want to process a batch of new files.
    A FileSystemWatcher on the folder where new files appear will give you an event you can use to add each file to the queue.

    Add an event for "item added to queue" that triggers a thread to process an individual file.

    Knock up a simple class holding the file, its state, the time it was detected, etc.
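A rough sketch of what such a class could look like. The names FileWorkItem and FileState are made up for illustration, and the set of states is only a guess at what you would want to track.

```csharp
using System;

// Hypothetical states; adjust to whatever your service needs to track.
public enum FileState { Detected, Processing, Succeeded, Failed }

// Hypothetical work item: one instance per detected file.
public sealed class FileWorkItem
{
    public FileWorkItem(string path)
    {
        if (path == null) throw new ArgumentNullException("path");
        Path = path;
        DetectedUtc = DateTime.UtcNow;   // time the file was noticed
        State = FileState.Detected;
    }

    public string Path { get; private set; }
    public DateTime DetectedUtc { get; private set; }
    public FileState State { get; set; }
}
```

Queueing these instead of bare paths makes it easier to log how long a file waited and what became of it.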

    You'd then have a detection thread adding to the queue and a thread pool processing the items, removing each one from the queue on success.
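One way to wire that up in .NET 4 is a BlockingCollection<string> fed by the watcher and drained by a fixed number of long-running tasks. This is a minimal sketch, not a finished service: FolderPipeline, Enqueue, the "*.xml" filter, and the processFile callback are all assumptions made for the example.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

// Hypothetical class: watcher (producer) -> queue -> N worker tasks (consumers).
public sealed class FolderPipeline : IDisposable
{
    private readonly BlockingCollection<string> _queue = new BlockingCollection<string>();
    private readonly FileSystemWatcher _watcher;
    private readonly Task[] _workers;

    public FolderPipeline(string folder, int workerCount, Action<string> processFile)
    {
        // Consumers: each task blocks until a path is available.
        _workers = new Task[workerCount];
        for (int i = 0; i < workerCount; i++)
        {
            _workers[i] = Task.Factory.StartNew(() =>
            {
                foreach (string path in _queue.GetConsumingEnumerable())
                {
                    try { processFile(path); }
                    catch (Exception ex) { Console.Error.WriteLine(path + ": " + ex.Message); }
                }
            }, TaskCreationOptions.LongRunning);
        }

        // Producer: the Created event enqueues each new file.
        _watcher = new FileSystemWatcher(folder, "*.xml");
        _watcher.Created += (sender, e) => Enqueue(e.FullPath);
        _watcher.EnableRaisingEvents = true;
    }

    // Also usable at startup for files that were already in the folder.
    public void Enqueue(string path)
    {
        try { _queue.Add(path); }
        catch (InvalidOperationException) { /* shutting down; ignore late events */ }
    }

    public void Dispose()
    {
        _watcher.Dispose();        // stop producing
        _queue.CompleteAdding();   // let the workers drain the queue and exit
        Task.WaitAll(_workers);
        _queue.Dispose();
    }
}
```

workerCount is the configurable number of simultaneous tasks. Bear in mind that Created can fire before the other application has finished writing the file, so processFile probably needs to cope with a file that is still locked, and that the watcher can drop events under heavy bursts (it raises its Error event when that happens).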

    You might find this previous question on thread-safe "lists" in .NET 4 useful:

    Thread-safe List<T> property

    Particularly if you want to process all new files since X.

    Note that if you aren't going to use FileSystemWatcher and just read files from the folder, a Processed folder to move them to, and maybe a Failed folder as well, would be a good idea. Reading 200,000 filenames just to check whether you've already processed them would remove most of the benefit of processing them in parallel.

    Even if you do use FileSystemWatcher, I'd recommend it. Just moving a file back into the "to process" folder (after an edit, in the case of failures) will trigger it to be reprocessed. Another advantage: say you are processing into a database, it all goes belly up, and your last backup was at X. You restore and then simply move all the files you did process back into the "to process" folder.

    You can also do test runs with known input and check the database's state before and after.
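The folder shuffle itself is only a few lines. A sketch, assuming the Processed and Failed folders live under a common root; FileRouter and ProcessAndMove are made-up names, and overwriting an existing file of the same name is a choice made for the example, not a recommendation.

```csharp
using System;
using System.IO;

public static class FileRouter
{
    // Runs process(path), then moves the file to <root>/Processed on success
    // or <root>/Failed if process threw. Returns the file's new location.
    public static string ProcessAndMove(string path, string root, Action<string> process)
    {
        string target;
        try
        {
            process(path);
            target = "Processed";
        }
        catch (Exception ex)
        {
            Console.Error.WriteLine(path + ": " + ex.Message);
            target = "Failed";
        }

        string dir = Path.Combine(root, target);
        Directory.CreateDirectory(dir);               // does nothing if it already exists
        string dest = Path.Combine(dir, Path.GetFileName(path));
        if (File.Exists(dest)) File.Delete(dest);     // assumption: last copy wins
        File.Move(path, dest);
        return dest;
    }
}
```

Reprocessing is then just a matter of moving files from Processed or Failed back into the watched folder.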

    Further to comment.

    The ThreadPool, which Task uses, has a thread limit, but that limit applies to all foreground and background tasks in your app.
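If you want to see what those pool-wide limits actually are on a given machine, something like this should do. PoolInfo and Describe are made-up names; GetMinThreads and GetMaxThreads are the ThreadPool calls that report the limits.

```csharp
using System;
using System.Threading;

public static class PoolInfo
{
    // Returns a one-line summary of the process-wide thread pool limits.
    public static string Describe()
    {
        int minWorkers, minIo, maxWorkers, maxIo;
        ThreadPool.GetMinThreads(out minWorkers, out minIo);
        ThreadPool.GetMaxThreads(out maxWorkers, out maxIo);
        return string.Format(
            "worker threads: min {0}, max {1}; I/O completion threads: min {2}, max {3}",
            minWorkers, maxWorkers, minIo, maxIo);
    }
}
```

Because these numbers are shared by everything in the process, they are a poor knob for limiting just the file-processing tasks.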

    After comment.

    If you want to limit the number of concurrent tasks...

    Here's a starter for ten that you can easily improve upon for tuning and boosting.

    In your class that manages kicking off tasks from the file queue, something like:

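    using System;

    // MyTask and ExecuteTask are placeholders for your own task type and launch logic.
    public class MyTaskManager
    {
      private readonly object _canRunLock = new object();
      private readonly int _maxTasks;
      private int _activeTasks;

      public MyTaskManager(int argMaxTasks)
      {
        _maxTasks = argMaxTasks;
      }

      // Starts argTask and returns true if a slot is free; otherwise returns false
      // so the caller can leave the file in the queue and retry later.
      public bool TryRunTask(MyTask argTask)
      {
        lock (_canRunLock)
        {
          if (_activeTasks < _maxTasks)
          {
            // Count the task before launching it, so a task that happens to run
            // synchronously cannot call TaskCompleted while the count is still 0.
            _activeTasks++;
            ExecuteTask(argTask);
            return true;
          }
        }
        return false;
      }

      // Each task must call this exactly once when it finishes, success or failure.
      public void TaskCompleted()
      {
        lock (_canRunLock)
        {
          if (_activeTasks == 0)
          {
            throw new InvalidOperationException("TaskCompleted called with no active tasks.");
          }
          _activeTasks--;
        }
      }
    }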
    

    Simple and safe (I think). You could have another property, such as pause or disable, to check as well. You might want to make the above a singleton ( 🙁 ), or at least bear in mind what happens if you run more than one.
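If you'd rather not maintain the counter by hand, SemaphoreSlim (available in .NET 4) gives you much the same gate. This is a different technique from the counter above, sketched under the assumption that blocking the dispatching thread while all slots are busy is acceptable; ThrottledRunner is a made-up name.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class ThrottledRunner
{
    private readonly SemaphoreSlim _slots;

    public ThrottledRunner(int maxConcurrent)
    {
        _slots = new SemaphoreSlim(maxConcurrent, maxConcurrent);
    }

    // Blocks the caller until a slot is free, then runs work on the thread pool.
    public Task Run(Action work)
    {
        _slots.Wait();
        return Task.Factory.StartNew(() =>
        {
            try { work(); }
            finally { _slots.Release(); }   // free the slot even if work throws
        });
    }
}
```

The release sits in a finally block, so a failing task cannot leak a slot, which is the main thing to watch for with a manual counter.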

    The best advice I can give is to start simple, open, and decoupled, and then complicate as necessary; it would be easy to start optimising prematurely here. It's a good idea not to have a load of threads all waiting on, say, the file system or a backend, but I doubt the number of processors is ever going to be the bottleneck, so your maxTasks is a bit of a finger-in-the-air figure.
    Some sort of self-tuning between a lower and an upper limit might be better than one fixed number.

© 2021 The Archive Base. All Rights Reserved