This earlier SO question talks about how to retrieve all files in a directory tree that match one of multiple extensions.
E.g., retrieve all files within C:\ and all subdirectories, matching *.log, *.txt, *.dat.
The accepted answer was this:
var files = Directory.GetFiles("C:\\path", "*.*", SearchOption.AllDirectories).Where(s => s.EndsWith(".mp3") || s.EndsWith(".jpg"));
This strikes me as quite inefficient. If you search a directory tree that contains thousands of files (it uses SearchOption.AllDirectories), the path of every single file in that tree is loaded into memory, and only then are the mismatches removed. (Reminds me of the ‘paging’ offered by ASP.NET datagrids.)
Unfortunately the standard System.IO.DirectoryInfo.GetFiles method only accepts one filter at a time.
It could be just my lack of LINQ knowledge: is it actually inefficient in the way I mention?
Secondly, is there a more efficient way to do it, both with and without LINQ (without resorting to multiple calls to GetFiles)?
I shared your problem and I found the solution in Matthew Podwysocki’s excellent post at codebetter.com.
He implemented a solution using native methods that lets you pass a predicate to his GetFiles implementation. He also wrote it with yield statements, which keeps memory use to an absolute minimum because files are returned one at a time.
With his code you can write something like the following:
The files variable will then refer to an enumerator that returns the matching files lazily (deferred execution).
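The following is not his code, only a minimal sketch of the same predicate-plus-yield idea. Instead of native methods it relies on the built-in Directory.EnumerateFiles (available from .NET 4), which streams paths rather than building the whole array first. The LazyFileSearch class and FindFiles helper are hypothetical names chosen for illustration, and the extension set comes from the example in the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class LazyFileSearch
{
    // Hypothetical helper (not Podwysocki's API): lazily yields every file
    // under `root` whose path satisfies `predicate`.
    public static IEnumerable<string> FindFiles(string root, Func<string, bool> predicate)
    {
        // EnumerateFiles hands back paths one at a time instead of
        // materializing an array of every path in the tree.
        foreach (var path in Directory.EnumerateFiles(root, "*", SearchOption.AllDirectories))
        {
            if (predicate(path))
                yield return path; // nothing runs until the caller iterates
        }
    }
}

class Program
{
    static void Main()
    {
        // Case-insensitive set of wanted extensions (leading dot included,
        // which is the form Path.GetExtension returns).
        var wanted = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            ".log", ".txt", ".dat"
        };

        // Assumes C:\path exists; no directory is touched on this line,
        // the walk only starts inside the foreach below.
        var files = LazyFileSearch.FindFiles(@"C:\path",
            p => wanted.Contains(Path.GetExtension(p)));

        foreach (var file in files)
            Console.WriteLine(file);
    }
}
```

One caveat with this approach: with SearchOption.AllDirectories, a directory you are not allowed to read can raise UnauthorizedAccessException partway through the enumeration, so real code may need to walk the subdirectories itself and handle that case per directory.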