My PHP project will use thousands of pictures, and each needs only a single number for its storage name.
My initial idea was to put all of the pictures in a single directory and name the files “0.jpg”, “1.jpg”, “2.jpg”, and so on, all the way up to “4294967295.jpg”.
Would it be better performance-wise to create a directory tree structure and name the files something like “429 / 496 / 7295.jpg”?
If the answer is yes, then the follow-up question would be: what is the optimal number of subdirectories or files per level of depth? And what effect does the chosen filesystem have on this?
Each file will have a corresponding MySQL row with an unsigned integer ID (INT UNSIGNED, maximum 4294967295).
Thank you.
It depends on which filesystem is being used. The ext family (ext2, ext3, ext4) has a dir_index option, which can be set when the filesystem is created, that makes storing thousands or even millions of files in a single directory reasonably fast.
btrfs is not yet production ready, but it handles this case by design: it stores directory entries in B-trees, so large directories need no special option.
But if you’re using the ext series without dir_index, or most other Unix filesystems, you will need to go for the more complex scheme of having several levels of directories. I would suggest you avoid that if you can. It just adds a lot of extra complication for something the filesystem ought to be handling reasonably for you.
If you do use the more complex scheme, I would suggest encoding the number in hex and having 256 files or directories at each level. Filesystems that aren’t designed to handle large numbers of files per directory typically do a linear scan to find a name, so the goal is to approximate a B-tree structure on your own. Two hex digits per level gives you about half of a 4 KiB disk block (a common size) per level with common means of encoding directories. That’s about as good as you’re going to get without a really complicated scheme like encoding your numbers in base 23 or base 24.
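To make the hex scheme concrete, here is a minimal sketch of the ID-to-path mapping. It is written in Python purely for illustration. The function name `image_path` and the `images` root directory are my own placeholders, and I am assuming the IDs fit in 32 unsigned bits, as the largest file name in the question suggests. A 32-bit ID is 8 hex digits, so two digits per level gives three directory levels plus the file name.

```python
from pathlib import PurePosixPath


def image_path(image_id: int, root: str = "images") -> PurePosixPath:
    """Map a 32-bit unsigned ID to root/aa/bb/cc/dd.jpg.

    Hypothetical helper: each path component is two hex digits,
    so no directory ever holds more than 256 entries.
    """
    if not 0 <= image_id <= 0xFFFFFFFF:
        raise ValueError("image_id must fit in 32 unsigned bits")

    # Zero-pad to 8 hex digits so every ID has the same depth.
    hex_id = f"{image_id:08x}"

    # Split into four two-digit chunks: three directories and one file stem.
    parts = [hex_id[i:i + 2] for i in range(0, 8, 2)]

    return PurePosixPath(root, *parts[:-1], parts[-1] + ".jpg")


if __name__ == "__main__":
    print(image_path(0))           # images/00/00/00/00.jpg
    print(image_path(1000))        # images/00/00/03/e8.jpg
    print(image_path(4294967295))  # images/ff/ff/ff/ff.jpg
```

The same logic should port to PHP with `sprintf('%08x', $id)` and `str_split($hex, 2)`. The zero padding matters: without it, small IDs would land at a different depth than large ones, and looking a file up would mean guessing how many levels to descend.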