What would be the best directory structure for a massive number of files?
I have more than 20 million files that use a numeric ID as the file name (e.g. 13842985.xml).
I would go with something like:
filename : 13842985.xml
directory : 1/3/8/13842985.xml
How can I do this properly, so that the files are spread evenly across the directories and subdirectories?
Change your method slightly: take the folder names from the last digits of the ID instead of the first, three digits per level.
I am assuming the filenames are somewhat random. This scheme will create 1000 top folders, each containing 1000 subfolders; that is 1,000,000 leaf folders, or roughly 20 files each if 20 million files are spread evenly. By starting from the last digits instead of the first, you will be protected against long filenames.
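Here is a rough Python sketch of how that could look. The function name `shard_path`, the zero-padding of short IDs, and the order of the levels (last three digits as the top folder) are my own assumptions for illustration, not something fixed by the scheme:

```python
from pathlib import PurePosixPath


def shard_path(filename: str, levels: int = 2, width: int = 3) -> PurePosixPath:
    """Build a nested path from the trailing digits of a numeric file name.

    Hypothetical helper: `levels` folders deep, `width` digits per folder name.
    """
    stem = PurePosixPath(filename).stem  # "13842985.xml" -> "13842985"
    # Assumption: left-pad short IDs with zeros so every group has full width.
    padded = stem.zfill(levels * width)
    parts = []
    for i in range(levels):
        end = len(padded) - i * width
        parts.append(padded[end - width:end])  # take groups from the right
    return PurePosixPath(*parts, filename)


print(shard_path("13842985.xml"))  # 985/842/13842985.xml
print(shard_path("42.xml"))        # 042/000/42.xml
```

With `width=3` and `levels=2` this gives the 1000 x 1000 layout; changing either parameter trades folder count against files per folder.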
Hope this helps!
Edit: By hashing the filename first and using the resulting number instead, you'll avoid degenerate cases (for example, IDs that vary only in their leading digits).
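A minimal sketch of the hashed variant, again in Python. The choice of SHA-256 and the name `hashed_shard_path` are assumptions on my part; any hash that spreads values evenly should do, and it does not need to be cryptographic:

```python
import hashlib
from pathlib import PurePosixPath


def hashed_shard_path(filename: str, levels: int = 2, width: int = 3) -> PurePosixPath:
    """Derive the folder names from a hash of the file name, not the name itself.

    Hypothetical helper; SHA-256 is an arbitrary choice of hash.
    """
    digest = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    number = int(digest, 16)  # treat the digest as one large integer
    parts = []
    for _ in range(levels):
        # Peel off `width` decimal digits per level, starting from the end.
        number, bucket = divmod(number, 10 ** width)
        parts.append(str(bucket).zfill(width))
    return PurePosixPath(*parts, filename)


# The same name always maps to the same folder, so lookups need no index.
print(hashed_shard_path("13842985.xml"))
```

The trade-off is that the folder can no longer be read off the filename by eye; you have to recompute the hash to find a file.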