I have a very simple text file parsing app which searches for an email address and, if one is found, adds it to a list.
Currently there are duplicate email addresses in the list, and I’m looking for a quick way of trimming it down to only distinct values, without iterating over them one by one 🙂
Here’s the code:
    var emailLines = new List<string>();
    using (var stream = new StreamReader(@"C:\textFileName.txt"))
    {
        while (!stream.EndOfStream)
        {
            var currentLine = stream.ReadLine();
            if (!string.IsNullOrEmpty(currentLine) && currentLine.StartsWith("Email: "))
            {
                emailLines.Add(currentLine);
            }
        }
    }
Try the following
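A minimal sketch, assuming LINQ (System.Linq) is available: read the lines with File.ReadAllLines, keep the ones that start with "Email: ", and let Distinct() drop the duplicates. ReadDistinctEmailLines is a hypothetical helper name, written here as a local function so the snippet can run as a top-level program; inside a class it would be an ordinary static method.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper (name chosen for illustration): returns the distinct
// lines of the file at `path` that start with "Email: ".
static List<string> ReadDistinctEmailLines(string path)
{
    return File.ReadAllLines(path)                        // loads the whole file into a string[]
        .Where(line => !string.IsNullOrEmpty(line)
                       && line.StartsWith("Email: "))     // same filter as in the question
        .Distinct()                                       // removes duplicate lines
        .ToList();
}

// Usage with the path from the question:
// var emailLines = ReadDistinctEmailLines(@"C:\textFileName.txt");
```

Distinct() compares the whole line with the default string comparer, so lines that differ only in case or trailing whitespace still count as different values.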
The downside to this approach is that it reads all of the lines in the file into a string[]. This happens immediately, and for large files it will create a correspondingly large array. It’s possible to get back the lazy reading of lines by using a simple iterator.
The File.ReadAllLines call above can then simply be replaced with a call to that iterator function.
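A minimal sketch of such an iterator, assuming a StreamReader-based loop like the one in the question. ReadAllLinesLazy and ReadDistinctEmailLinesLazy are hypothetical names used for illustration; both are written as local functions so the snippet can run as a top-level program.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical iterator: yields one line at a time instead of building a
// string[] for the whole file. The reader is disposed when enumeration ends.
static IEnumerable<string> ReadAllLinesLazy(string path)
{
    using (var stream = new StreamReader(path))
    {
        while (!stream.EndOfStream)
        {
            yield return stream.ReadLine();
        }
    }
}

// Same query as before, with the lazy iterator in place of File.ReadAllLines.
static List<string> ReadDistinctEmailLinesLazy(string path)
{
    return ReadAllLinesLazy(path)
        .Where(line => !string.IsNullOrEmpty(line) && line.StartsWith("Email: "))
        .Distinct()
        .ToList();
}
```

Only the distinct matching lines are kept in memory this way, not every line of the file. On .NET Framework 4 and later, the built-in File.ReadLines also enumerates lines lazily and could be used in place of the hand-written iterator.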