When recursively traversing a directory structure, what is the most efficient algorithm to use if you have more files than directories? I've noticed that depth-first traversal seems to take longer when a directory contains a lot of files. Does breadth-first traversal work more efficiently in this case? I have no way to profile the two algorithms at the moment, so your insights are very much welcome.
EDIT: In response to alphazero’s comment, I’m using PHP on a Linux machine.
It makes sense that breadth-first would work better. When you enter your root folder, you create a list of items you need to deal with. Some of those items are files and some are directories.
If you use breadth-first, you would deal with the files in the directory and forget about them before moving on to one of the child directories.
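As a rough sketch of that breadth-first approach (in Python rather than PHP, since the idea is language-agnostic; `process_file` is a hypothetical callback standing in for whatever you do with each file):

```python
import os
from collections import deque

def bfs_walk(root, process_file):
    # The queue holds only directories still to be visited;
    # files are handled immediately and then forgotten.
    queue = deque([root])
    while queue:
        current = queue.popleft()
        with os.scandir(current) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    queue.append(entry.path)
                else:
                    process_file(entry.path)
```

Note that the queue only ever contains directory paths, so its size is bounded by the number of directories, not the (larger) number of files.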
If you use depth-first, you keep growing a list of files to deal with later as you drill deeper. Maintaining that list uses more memory, possibly causing more page faults, etc.
Plus, you'd have to scan the list of new items anyway to figure out which ones are directories you can drill into, and then scan that same list again (minus the directories) when you finally get around to dealing with the files.
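For contrast, here is a sketch of the depth-first variant described above (again in Python as a language-agnostic illustration, with a hypothetical `process_file` callback): subdirectories are drilled into first, files accumulate in a deferred list, and each directory listing is scanned twice.

```python
import os

def dfs_walk(root, process_file):
    deferred = []  # grows as we drill deeper: files waiting to be handled

    def visit(directory):
        entries = list(os.scandir(directory))
        # First pass over the listing: find subdirectories and drill in.
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                visit(entry.path)
        # Second pass over the same listing: defer the files for later.
        for entry in entries:
            if not entry.is_dir(follow_symlinks=False):
                deferred.append(entry.path)

    visit(root)
    # Only now do we deal with the files accumulated along the way,
    # so `deferred` can grow as large as the total file count.
    for path in deferred:
        process_file(path)
```

Here the peak size of `deferred` is proportional to the total number of files, which is exactly the memory cost the answer is warning about.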