The Archive Base

Editorial Team
Asked: May 10, 2026

I noticed a number of cases where an application or database stored collections of

I noticed a number of cases where an application or database stored collections of files/blobs using a hash to determine the path and filename. I believe the intended outcome is that the path never gets too deep and the folders never get too full (too many files or folders in a single folder makes for slower access).

EDIT: Typical examples are digital libraries or repositories, though the simplest example I can think of (it can be installed in about 30 seconds) is the Zotero document/citation database.

Why do this?

EDIT: Thanks, Mat, for the answer. Does this technique of using a hash to create a file path have a name? Is it a pattern? I’d like to read more, but I have failed to find anything in the ACM Digital Library.

1 Answer

  1. Added an answer on May 10, 2026 at 9:15 pm

    Hash vs. B-tree

    A hash index has the advantage of being faster to look up when you are only going to use the ‘=’ operator for searches.

    If you are going to use operators such as ‘<‘ or ‘>’, or anything other than ‘=’, you will want a B-tree, because it keeps keys in order and can therefore serve that kind of search.
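To make the difference concrete, here is a minimal sketch. It is only an illustration: a Python dict stands in for a hash index, and a sorted list searched with `bisect` stands in for the ordered keys of a B-tree (it is not an actual B-tree). The record IDs and titles are made up.

```python
import bisect

# Hypothetical data: record IDs mapped to titles.
records = {17: "alpha", 42: "beta", 58: "gamma", 73: "delta", 91: "epsilon"}

# Hash-style index: direct lookup, but only for exact matches ('=').
hash_index = dict(records)

# Ordered index: sorted keys stand in for the key order a B-tree maintains.
sorted_keys = sorted(records)


def lookup_equal(key):
    """Exact-match lookup; returns None when the key is absent."""
    return hash_index.get(key)


def lookup_range(low, high):
    """Return titles for all keys with low <= key <= high, in key order."""
    start = bisect.bisect_left(sorted_keys, low)
    end = bisect.bisect_right(sorted_keys, high)
    return [records[k] for k in sorted_keys[start:end]]


print(lookup_equal(42))
print(lookup_range(40, 80))
```

The dict cannot answer the range query without scanning every key, whereas the ordered structure finds both ends of the range by binary search.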

    Directory structure

    If you have hundreds of thousands of files to store on a filesystem and you put them all in a single directory, you will reach a point where the directory (its inode and list of entries) grows so large that adding or removing a file takes minutes. You might even reach the point where it no longer fits in memory, and you can no longer add or remove files, or even touch the directory.

    A hashing method foo is deterministic: foo(‘something’) will always return the same value, say ‘grbezi’. You then use part of that hash to decide where to store the file, say in gr/be/something. The next time you need that file, you only have to compute the hash again and the location is known directly. In addition, a good hash function spreads its outputs evenly over the hash space, so a large number of files ends up evenly distributed across the hierarchy, which splits the load.
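The scheme described above could be sketched roughly as follows. This is an assumption-laden illustration, not how any particular application does it: SHA-256 is swapped in for the unnamed hash "foo", and the function name `sharded_path`, its two-level layout, and the two hex characters per level are all hypothetical choices.

```python
import hashlib
from collections import Counter
from pathlib import PurePosixPath


def sharded_path(name, levels=2, width=2):
    """Build a relative path such as 'ab/cd/name' from the hash of name.

    Assumption: SHA-256 stands in for the answer's unnamed hash 'foo'.
    """
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()
    # Take consecutive slices of the hex digest, one per directory level.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(*parts, name)


# The same name always maps to the same path, so no lookup table is needed.
print(sharded_path("something"))

# Count how many of 20,000 hypothetical file names land in each
# top-level directory to see how evenly the hash spreads them.
counts = Counter(sharded_path(f"file-{i}.pdf").parts[0] for i in range(20_000))
print(len(counts), min(counts.values()), max(counts.values()))
```

With two hex characters per level there are at most 256 directories at each level, so the depth is fixed and each folder holds roughly the total number of files divided by 256 squared.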

© 2021 The Archive Base. All Rights Reserved