The question: is a deep folder structure better, or fewer subfolders with thousands of files each?
The problem:
I have a VB.NET program that generates around 2500 XML files per year (about 100 KB per file).
I have to store the files on a file server (Windows 7 or NAS).
On the network there are around 30 PCs using that program.
I am looking for the best way to lay out the folders on the file server, with the goal of having a human-readable folder structure and, at the same time, fast access to the files.
In the past I made a similar program with the following structure:
\fileserver\PC1\year\months\file00001.xml
In other words, a folder for each PC on the LAN,
then a subfolder for each year,
then a subfolder for each month,
and in the month folder the files generated in that month
(of course the filename carries a special stamp).
This way I got nearly 200 files per month.
That program has been running for years without problems.
But now I would like to remove the “MONTH” subfolder so that all the files generated by a PC in the current year sit together in the year subfolder:
\fileserver\PC1\year\file00001.xml
This solution would produce a cleaner folder tree, but more files per folder.
I do not know whether this could hurt speed when the files are accessed by VB.NET programs or by third-party applications.
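To make the two layouts concrete, here is a minimal sketch in Python (not the VB.NET program itself). The share root `\\fileserver\data`, the helper names, and the two-digit month folder are assumptions for illustration only. It also assumes the figure of 2500 files is per PC per year, which matches the roughly 200 files per month mentioned above.

```python
from datetime import date
from pathlib import PureWindowsPath

FILES_PER_YEAR = 2500  # figure from the question, assumed to be per PC


def monthly_path(root, pc, day, filename):
    """Existing layout: root\\PC\\year\\month\\file (month folder name is assumed)."""
    return PureWindowsPath(root) / pc / str(day.year) / f"{day.month:02d}" / filename


def yearly_path(root, pc, day, filename):
    """Proposed layout: root\\PC\\year\\file."""
    return PureWindowsPath(root) / pc / str(day.year) / filename


# Rough number of files per leaf folder in each layout.
files_per_month_folder = round(FILES_PER_YEAR / 12)
files_per_year_folder = FILES_PER_YEAR

if __name__ == "__main__":
    root = r"\\fileserver\data"  # hypothetical share name
    day = date(2024, 3, 15)
    print(monthly_path(root, "PC1", day, "file00001.xml"))
    print(yearly_path(root, "PC1", day, "file00001.xml"))
    print(files_per_month_folder, files_per_year_folder)
```

So the change moves each leaf folder from a couple of hundred files to a few thousand.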
Which folder structure would you choose?
Thanks for replying.
If you use NTFS, measurements show that a flat structure is faster than working with subdirectories, but the difference is minimal (maybe 1% or even less; I don’t have the numbers at hand right now).
Update: For a single file access, fewer searches are involved and subdirectories offer better performance. But if you access your files randomly, more and more files will be touched over time, and the OS will have to scan all the directories and load them into memory. When it comes to processing a large number of files, subdirectories tend to become slower. Also, NTFS keeps an index of file names, so opening a particular file is quite fast, and walking through subdirectories can be even slower than opening the file from a single folder.
To summarize: speed depends significantly on the usage scenario. I also believed that grouping files into subdirectories would bring significant benefits, until I ran tests. NTFS performed much better with hundreds of thousands of files in one folder than one would expect. So I’d recommend running your own tests for your particular usage scenario.
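A test of that kind could look roughly like the sketch below, written in Python for brevity. All function names are made up for illustration. It builds a flat folder and a folder split into 12 subfolders in a local temporary directory, then times opening every file in random order. Treat it only as a starting point: a real test should run against the actual file server or NAS, use realistic file sizes, and account for OS caching, none of which this sketch does.

```python
import os
import random
import tempfile
import time


def build_tree(root, n_files, n_subdirs):
    """Create n_files small files under root; n_subdirs=0 means one flat folder."""
    paths = []
    for i in range(n_files):
        if n_subdirs:
            folder = os.path.join(root, f"{i % n_subdirs + 1:02d}")
        else:
            folder = root
        os.makedirs(folder, exist_ok=True)
        path = os.path.join(folder, f"file{i:05d}.xml")
        with open(path, "w", encoding="utf-8") as fh:
            fh.write("<doc/>")  # placeholder content, far smaller than 100 KB
        paths.append(path)
    return paths


def time_random_reads(paths, seed=0):
    """Open and read every file once in shuffled order; return elapsed seconds."""
    order = list(paths)
    random.Random(seed).shuffle(order)
    start = time.perf_counter()
    for p in order:
        with open(p, "rb") as fh:
            fh.read()
    return time.perf_counter() - start


def compare(n_files=2500, n_subdirs=12):
    """Time random reads for a flat layout and a layout split into subfolders."""
    with tempfile.TemporaryDirectory() as tmp:
        flat = build_tree(os.path.join(tmp, "flat"), n_files, 0)
        nested = build_tree(os.path.join(tmp, "nested"), n_files, n_subdirs)
        return {
            "flat": time_random_reads(flat),
            "nested": time_random_reads(nested),
        }


if __name__ == "__main__":
    print(compare())
```

Running it several times and with much larger `n_files` gives a better picture than a single run, since the first pass is usually dominated by cache warm-up.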