The Archive Base

Editorial Team
Asked: May 23, 2026

On a single filesystem, I need to store 1 billion 1 KB text files. Every file has a unique ID string, and access should be performance-optimized.
Which layout is best?

EXT4: (example file structure for filename: kdWqpGQ1)

/kd/Wq/pG/Q1.file

or

/kdWqpGQ1.file

Or should I avoid this and use some kind of non-relational database?

Also, I can always split the 5 TB volume I have across 5 × 1 TB hard drives, with 200M files on each. I should add that 1B files is the limit case; I will most probably reach only 500M.

Thank you!


1 Answer

  1. Editorial Team
    Added an answer on May 23, 2026 at 2:11 pm

    Your first option is much faster.

    Think of a directory in a file system as a text file containing an unsorted list of all the files in that directory, each with the address where the file can be found on disk. To read a file, you need to know that address. For a path like ‘/myfilename’, you first have to open the file /, which is a directory and lists everything it contains. Then you have to scan that list for the entry ‘myfilename’, which in the worst case means traversing the entire list. On average that takes about N/2 comparisons, i.e. O(N), where N is apparently 1 billion (the total number of files in this directory). Real file systems are often smarter than this model (ext4, for example, can index directories with a hash tree when dir_index is enabled), but the model shows why directory size matters.

    Now suppose you use multiple directories, say always 1,000 entries per directory, so that you have 3 levels of directories and your file path becomes /A/B/myfilename. You first open the / directory and find A (about 1000/2 comparisons), then open that directory and find B (about 1000/2 again), then open that directory to find myfilename (yet again about 1000/2). Adding those up gives 3 × 500 = 1,500 comparisons, which is MUCH faster than the roughly 500,000,000 we had previously.
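The arithmetic above can be checked with a small sketch. It assumes the simplified model used in this answer, where a directory is an unsorted list and a lookup scans half of its entries on average. The function name `expected_comparisons` is made up for illustration, and a real file system may index its directories differently.

```python
def expected_comparisons(entries_per_level):
    """Average number of entry comparisons needed to resolve a path,
    assuming every directory on the path is an unsorted list that is
    scanned linearly (a simplification, not how every file system works)."""
    return sum(n / 2 for n in entries_per_level)

# Flat layout: a single directory holding all 1 billion files.
flat = expected_comparisons([1_000_000_000])

# Nested layout: /A/B/myfilename with 1000 entries in each of the
# three directories on the path (1000 * 1000 * 1000 = 1 billion files).
nested = expected_comparisons([1000, 1000, 1000])

print(flat)    # 500000000.0
print(nested)  # 1500.0
```

Under this model the cost grows with the sum of the directory sizes along the path, while the number of files that can be addressed grows with their product.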

    This is a very important aspect of file systems to keep in mind. If a directory is in danger of growing beyond 10,000 files, I’d strongly recommend thinking about a strategy for sorting those files into subdirectories.
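One possible strategy, sketched below, is the layout from the question: cut the ID into fixed-width chunks and use the leading chunks as directory names. The helper names `sharded_path` and `write_text_file`, the defaults of three levels and two characters per level, and the `/data` root in the usage note are assumptions for illustration. The sketch also assumes the IDs are safe to use as file names and are longer than the directory prefix.

```python
from pathlib import Path

def sharded_path(root, file_id, levels=3, width=2, suffix=".file"):
    """Map an ID such as 'kdWqpGQ1' to root/kd/Wq/pG/Q1.file.

    The first `levels` chunks of `width` characters become directory
    names; the remainder of the ID becomes the file name.
    """
    prefix_len = levels * width
    if len(file_id) <= prefix_len:
        raise ValueError("ID too short for the chosen sharding scheme")
    parts = [file_id[i:i + width] for i in range(0, prefix_len, width)]
    return Path(root).joinpath(*parts, file_id[prefix_len:] + suffix)

def write_text_file(root, file_id, text):
    """Create the parent directories if needed, then write the file."""
    path = sharded_path(root, file_id)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
    return path
```

For example, `sharded_path("/data", "kdWqpGQ1")` yields the path `/data/kd/Wq/pG/Q1.file`. How evenly the files spread across directories depends on how the IDs are distributed, so the number of levels and the chunk width would need tuning for a real data set.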

    Whether you would be better off with a database depends on other questions: Do you need backups (created concurrently)? Do you need transactions beyond what simple journaling file systems offer? Do you need concurrency control? Do you need to search through your files? How often do you need to access the files? How often do you change them?

    For further reading on file systems I recommend the book Modern Operating Systems by Tanenbaum (chapter 6, “File Systems”), which is available online here: http://lovingod.host.sk/index.html?page=tanenbaum%2FOperating-Systems-Design.html

