
The Archive Base

Editorial Team
Asked: May 12, 2026 (2026-05-12T08:37:57+00:00)

I have a process that’s going to initially generate 3-4 million PDF files, and

I have a process that will initially generate 3-4 million PDF files and then continue at a rate of about 80,000 per day. Each file will be small (around 50 KB), but I’m worried about how to manage that many files for easy lookup. Some details:

  1. I’ll have some other steps to run once a file has been generated, and a few servers will be participating, so I’ll need to watch for files as they’re generated.
  2. Once generated, the files will be available through a lookup process I’ve written. Essentially, I’ll need to pull them based on an order number, which is unique per file.
  3. At any time, an existing order number may be resubmitted, and the newly generated file will need to overwrite the original copy.

Originally, I had planned to write these files all to a single directory on a NAS, but I realize this might not be a good idea, since there are millions of them and Windows might not handle a million-file-lookup very gracefully. I’m looking for some advice:

  1. Is a single folder okay? The files will never be listed – they’ll only be retrieved using a System.IO.File with a filename I’ve already determined.
  2. If I use a single folder, can I watch for new files with a System.IO.FileSystemWatcher, or will it become sluggish with that many files?
  3. Should they be stored as BLOBs in a SQL Server database instead? Since I’ll need to retrieve them by a reference value, maybe this makes more sense.

Thank you for your thoughts!


1 Answer

  1. Editorial Team, added an answer on May 12, 2026 at 8:37 am (2026-05-12T08:37:57+00:00)

    I’d group the files into subfolders, and try to organize those subfolders in some business-logic way. Perhaps all files made during a given day? During a six-hour period of each day? Or start a new subfolder every N files; I’d say a few thousand at most. (There’s probably an ideal number out there; hopefully someone will post it.)
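A minimal sketch of the "every N files" grouping, written in Python for brevity (the same logic ports to System.IO). It assumes order numbers are integers and that 1,000 files per folder is an acceptable cap. The names `path_for_order`, `save_pdf`, and `FILES_PER_DIR` are made up for illustration.

```python
import os
import tempfile
from pathlib import Path

FILES_PER_DIR = 1000  # assumed cap, in line with "a few thousand at most"


def path_for_order(root: Path, order_number: int) -> Path:
    """Map an order number to a stable two-level subfolder path."""
    bucket = order_number // FILES_PER_DIR  # 1,000 consecutive orders share a folder
    top = bucket // 1000                    # at most 1,000 bucket folders per top folder
    return root / f"{top:04d}" / f"{bucket:07d}" / f"{order_number}.pdf"


def save_pdf(root: Path, order_number: int, data: bytes) -> Path:
    """Write or overwrite the PDF for an order without exposing a partial file."""
    target = path_for_order(root, order_number)
    target.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temp file in the same folder, then swap it into place.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        os.replace(tmp_name, target)  # replaces any existing copy
    except BaseException:
        os.unlink(tmp_name)
        raise
    return target
```

Because the path depends only on the order number, a resubmitted order lands on the same file and overwrites it, and the lookup needs no directory listing. Grouping by creation day instead would require remembering each order’s original date somewhere. Whether the final rename is atomic on a NAS share depends on the file system and protocol, so treat that part as an assumption to verify.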

    Do the files ever age out and get deleted? If so, sort and file them by deletable chunk. If not, can I be your hardware vendor?
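To show why filing by deletable chunk helps, here is a hedged Python sketch that assumes one folder per day named `YYYY-MM-DD` under a common root. That layout and the function name `purge_old_days` are illustrative assumptions, not something the question specifies.

```python
import shutil
from datetime import date, timedelta
from pathlib import Path


def purge_old_days(root: Path, keep_days: int, today: date) -> list[str]:
    """Delete day folders older than keep_days and return the names removed."""
    cutoff = today - timedelta(days=keep_days)
    removed = []
    for folder in sorted(root.iterdir()):
        if not folder.is_dir():
            continue
        try:
            folder_day = date.fromisoformat(folder.name)
        except ValueError:
            continue  # not a day folder; leave it alone
        if folder_day < cutoff:
            shutil.rmtree(folder)  # one call drops the whole chunk
            removed.append(folder.name)
    return removed
```

Aging out a day then costs one folder removal instead of a scan over millions of individual files.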

    There are arguments on both sides of storing files in a database.

    • On the one hand you get enhanced security, ’cause it’s more awkward to pull the files from the DB; on the other hand, you get potentially poorer performance, ’cause it’s more awkward to pull the files from the DB.
    • In the DB, you don’t have to worry about how many files per folder, sector, NAS cluster, whatever; that’s the DB’s problem, and it probably has a good implementation for this. On the flip side, it’ll be harder to manage/review the data, as it’d be a bazillion blobs in a single table, and, well, yuck. (You could partition the table based on the aforementioned business logic, which would make deletion or archiving far easier to perform. That, or maybe partitioned views, since table partitioning in SQL Server 2008 is limited to 1,000 partitions per table.)
    • SQL Server 2008 has the FILESTREAM feature (an attribute on varbinary(max) columns rather than a separate data type); I don’t know much about it, but it might be worth looking into.
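For the database side of the trade-off above, a rough sketch of "one row per order number, overwrite on resubmit". It uses Python’s built-in `sqlite3` purely as a runnable stand-in for SQL Server. The table name `order_pdf` and its columns are assumptions, and on SQL Server the content column would be `varbinary(max)` with the upsert written in T-SQL.

```python
import sqlite3
from typing import Optional


def open_store(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open the stand-in store and create the table if it is missing."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS order_pdf ("
        " order_number INTEGER PRIMARY KEY,"
        " generated_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,"
        " content BLOB NOT NULL)"
    )
    return conn


def put_pdf(conn: sqlite3.Connection, order_number: int, data: bytes) -> None:
    """Insert the PDF, or replace the existing row for a resubmitted order."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO order_pdf (order_number, content) VALUES (?, ?)",
            (order_number, data),
        )


def get_pdf(conn: sqlite3.Connection, order_number: int) -> Optional[bytes]:
    """Fetch the PDF bytes by order number, or None if there is no such order."""
    row = conn.execute(
        "SELECT content FROM order_pdf WHERE order_number = ?", (order_number,)
    ).fetchone()
    return row[0] if row else None
```

The primary key on the order number gives the lookup-by-reference the question asks for, and resubmission becomes a single-row replace.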

    A last point to worry about is keeping the data “aligned”. If the DB stores the info on the file along with the path/name to the file, and the file gets moved or renamed without the DB being updated, the two fall out of sync and you could get totally hosed.
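One way to catch that drift is a periodic reconciliation pass. The sketch below assumes the DB can hand back `(order_number, path)` pairs; the function name `find_orphaned_rows` is hypothetical.

```python
from pathlib import Path
from typing import Iterable


def find_orphaned_rows(rows: Iterable[tuple[int, str]]) -> list[int]:
    """Return order numbers whose recorded path no longer points at a file."""
    return [order for order, path in rows if not Path(path).is_file()]
```

This only checks one direction (rows without files). Finding files without rows would need a walk over the folders, which is slow at this scale and is best run off-hours.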

© 2021 The Archive Base. All Rights Reserved