To begin with, I have this inexplicable affection for LINQ and lambda expressions 🙂
So I wrote some quite straightforward code using LINQ, which is supposed to get files from a directory according to a certain name pattern, order them, and accumulate them until the total length of the accumulated files exceeds a certain threshold:
    IEnumerable<FileInfo> l_allFiles = new DirectoryInfo(l_sDirName).GetFiles()
        .Where(l_fileInfo => ms_pattern.IsMatch(l_fileInfo.Name))
        .OrderBy(l_fileInfo => l_fileInfo.CreationTime);
    int l_nFilesTotal = l_allFiles.Count();
    if (l_nFilesTotal > 0)
    {
        long l_nAccumulatedCmdLength = 0;
        IEnumerable<FileInfo> l_selectedFiles = l_allFiles.TakeWhile(l_fileInfo => (l_nAccumulatedCmdLength += l_fileInfo.Length) <= Settings.Default.Threshold);
        int l_nNumOfSelected = l_selectedFiles.Count();
        if (l_nNumOfSelected > 0)
        {
            l_ret = new A { Files = l_selectedFiles };
        }
    }
Well, this code works fine when all found files together do not exceed the threshold.
As soon as not all found files are selected into l_selectedFiles, in most cases l_selectedFiles.Count() returns 0 even though l_selectedFiles is not empty.
In rare cases, when l_selectedFiles.Count() returns the correct value, the subsequent call to Files.Count() in the A class returns 0.
To add to the mystery, the debugger always updates the value of l_nAccumulatedCmdLength not after execution of the TakeWhile() method, but after executing the next l_selectedFiles.Count() statement. In cases where not all files are selected, the value of l_nAccumulatedCmdLength is not always as expected…
To complete the picture, I use Microsoft Visual Studio 2010 Ultimate on Windows 7 Professional and my project targets .NET 4.0.
Can anybody give an explanation of or a hint about this behavior and/or how to fix it? I am quite lost and cannot even imagine how to debug and resolve this issue, and the community is my last hope.
Thank you all in advance for your replies and comments.
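You’re modifying a variable within a query:
    IEnumerable<FileInfo> l_selectedFiles = l_allFiles.TakeWhile(l_fileInfo => (l_nAccumulatedCmdLength += l_fileInfo.Length) <= Settings.Default.Threshold);
Note the change to l_nAccumulatedCmdLength within your TakeWhile condition.
That’s a really bad idea, which will end up with the sequence giving different results each time you evaluate it. Just don’t do it. I strongly suspect that’s the cause of the problem.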
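To see why the results differ between evaluations, here is a minimal sketch. It assumes three hypothetical lengths of 40 and a threshold of 100 in place of the real files and Settings.Default.Threshold; the class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DeferredDemo
{
    // Hypothetical stand-ins for the file lengths and Settings.Default.Threshold.
    static readonly long[] Lengths = { 40, 40, 40 };
    const long Threshold = 100;

    public static int[] CountTwice()
    {
        long accumulated = 0;

        // Same shape as the query in the question: the predicate mutates a captured variable.
        // Nothing is iterated here; TakeWhile only builds a deferred sequence.
        IEnumerable<long> selected = Lengths.TakeWhile(len => (accumulated += len) <= Threshold);

        int first = selected.Count();  // runs the predicate: accumulated becomes 40, 80, 120
        int second = selected.Count(); // runs it again, now starting from 120
        return new[] { first, second };
    }

    static void Main()
    {
        int[] counts = CountTwice();
        Console.WriteLine(counts[0] + " then " + counts[1]); // prints "2 then 0"
    }
}
```

The second count is 0 because the accumulator is never reset between enumerations, which is consistent with a later Count() on the same sequence coming back empty.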
Note that this part:
“the debugger always updates the value … not after execution of the TakeWhile() method, but after executing the next l_selectedFiles.Count() statement”
… is very easily explained. TakeWhile doesn’t iterate over the sequence – it just builds a new sequence which will be lazily evaluated.
If you want to get consistent results, use ToList… but it would be much better not to modify a variable in the query in the first place. Use Aggregate to create a sequence of Tuple<FileInfo, long> values where the long value is the “size so far” if you want.
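One way this suggestion could look is sketched below. It is only an illustration: WithRunningTotal and TakeWhileTotalAtMost are hypothetical helper names, and Tuple<string, long> pairs of (name, length) stand in for FileInfo so that the example runs without touching the file system.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class RunningTotalDemo
{
    // Pairs each item with the cumulative size up to and including that item.
    // The running total lives in the Aggregate accumulator, so the query does not
    // mutate any variable captured from the calling code.
    public static List<Tuple<T, long>> WithRunningTotal<T>(IEnumerable<T> source, Func<T, long> size)
    {
        return source.Aggregate(
            new List<Tuple<T, long>>(),
            (acc, item) =>
            {
                long previous = acc.Count == 0 ? 0 : acc[acc.Count - 1].Item2;
                acc.Add(Tuple.Create(item, previous + size(item)));
                return acc;
            });
    }

    // Takes items while the "size so far" stays within the threshold.
    public static List<T> TakeWhileTotalAtMost<T>(IEnumerable<T> source, Func<T, long> size, long threshold)
    {
        return WithRunningTotal(source, size)
            .TakeWhile(pair => pair.Item2 <= threshold)
            .Select(pair => pair.Item1)
            .ToList(); // materialize once so every later Count sees the same result
    }

    static void Main()
    {
        // Hypothetical (name, length) pairs standing in for FileInfo objects.
        var files = new[]
        {
            Tuple.Create("a.cmd", 40L),
            Tuple.Create("b.cmd", 40L),
            Tuple.Create("c.cmd", 40L)
        };

        List<Tuple<string, long>> selected = TakeWhileTotalAtMost(files, f => f.Item2, 100);
        Console.WriteLine(selected.Count); // prints 2
        Console.WriteLine(selected.Count); // prints 2 again
    }
}
```

With real files, the size selector would presumably be something like f => f.Length on FileInfo. Because the result is a List, passing it on to another class and counting it there would not re-run the predicate.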