Edit: Ok, since it’s clear I’m taking the wrong approach with this, I’ll explain what I was intending to do. The overall intent is to (as an exercise) verify all valid email addresses according to spec. This portion was to generate a portion of the data-set to verify the algorithm against.
As an exercise, I’m writing a program that will generate all possible email addresses. This will result in 808165 ≈ 1.4e122 possible items. I’m currently using List<T>s to store the generated items but my understanding is that it has a maximum capacity of Int32.MaxValue. I’m guessing a proper solution isn’t going to involve Lists of Lists of Lists. This is what I have so far.
    // Requires: using System; using System.Collections.Generic;
    private void GenerateLocalPart()
    {
        // Character classes that may appear in the local part.
        List<string> validLocalSymbols = new List<string>()
        {
            ".", "!", "#", "$", "%", "&", "*", "+", "-",
            "/", "^", "_", "`", "{", "|", "}", "~", "\"",
        };
        List<string> validLocalNumbers = new List<string>()
        {
            "0", "1", "2", "3", "4", "5", "6", "7", "8", "9",
        };
        List<string> validLocalLowercase = new List<string>()
        {
            "a", "b", "c", "d", "e", "f", "g", "h", "i", "j",
            "k", "l", "m", "n", "o", "p", "q", "r", "s", "t",
            "u", "v", "w", "x", "y", "z",
        };
        List<string> validLocalUppercase = new List<string>()
        {
            "A", "B", "C", "D", "E", "F", "G", "H", "I", "J",
            "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T",
            "U", "V", "W", "X", "Y", "Z",
        };

        // The full alphabet (not used below yet; the test run uses lowercase only).
        List<string> validLocalPartCharacters = new List<string>();
        validLocalPartCharacters.AddRange(validLocalSymbols);
        validLocalPartCharacters.AddRange(validLocalNumbers);
        validLocalPartCharacters.AddRange(validLocalLowercase);
        validLocalPartCharacters.AddRange(validLocalUppercase);

        List<string> targetSequence = validLocalLowercase;
        int lengthOfStringToGenerate = 5;
        int numberOfDifferentSourceCharacters = targetSequence.Count;

        // localPart[n] holds every generated string of length n + 1.
        List<List<string>> localPart = new List<List<string>>();

        // Seed: all strings of length 1.
        List<string> localPartSeed = new List<string>();
        localPart.Add(localPartSeed);
        foreach (string character in targetSequence)
            localPartSeed.Add(character);

        // Build each longer list by appending every character to every
        // string in the previous list.
        for (int i = 1; i < lengthOfStringToGenerate; i++)
        {
            List<string> bufferList = new List<string>();
            localPart.Add(bufferList);
            foreach (string lastListString in localPart[i - 1])
                foreach (string character in targetSequence)
                    bufferList.Add(lastListString + character);
        }

        Console.WriteLine("Break here.");
    }
lengthOfStringToGenerate is the maximum length of the strings, so the code generates all combinations of length 1 through lengthOfStringToGenerate. localPart ends up holding one List per length, that is, lengthOfStringToGenerate Lists in total. Is there a different type of collection that I should be using? Is there a different overall approach I should be taking?
Where were you expecting to store all this data?
List<T> will always store its values in memory… but even if you write something to store the results to disk, you’re still not going to be able to hold 1.4e122 items. Have you really taken in just how big that number is? Even at a single bit per item, that’s way more than the capacity of the universe, if the whole of the universe was one big hard disk.
The largest unit of data I’ve ever heard of being talked about in a meaningful way is an exabyte, which is 10^18 bytes. For most people, a petabyte (10^15 bytes) is a pretty huge amount of data. What you’re considering makes those quantities seem microscopically small.
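To put rough numbers on that, here is a back-of-the-envelope sketch. It takes the 1.4e122 figure from the question as given rather than re-deriving it. The generation rate of one billion items per second is purely an assumption for illustration, and the class and member names (ScaleEstimate and so on) are made up for this example.

```csharp
using System;

public static class ScaleEstimate
{
    // Item count quoted in the question; taken as given, not re-derived.
    public const double ItemCount = 1.4e122;

    // Assumed (hypothetical) generation rate: one billion items per second.
    public const double ItemsPerSecond = 1e9;

    public const double SecondsPerYear = 365.25 * 24 * 60 * 60;

    // An exabyte is 10^18 bytes, i.e. 8 * 10^18 bits.
    public const double BitsPerExabyte = 8e18;

    // Years needed just to produce every item once at the assumed rate.
    public static double YearsToGenerate() =>
        ItemCount / ItemsPerSecond / SecondsPerYear;

    // Exabytes needed if each item cost only a single bit of storage.
    public static double ExabytesAtOneBitEach() =>
        ItemCount / BitsPerExabyte;

    public static void Main()
    {
        Console.WriteLine($"Years at 1e9 items/s:   {YearsToGenerate():E2}");
        Console.WriteLine($"Exabytes at 1 bit/item: {ExabytesAtOneBitEach():E2}");
    }
}
```

Under those assumptions the run time comes out on the order of 10^105 years and the storage on the order of 10^103 exabytes, so no choice of collection type changes the picture.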
What are you trying to do with the data afterwards? And when would you expect such an algorithm to ever actually finish?
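If the goal is only to feed a validator with short test inputs, one option is to stream the strings instead of storing them. The following is a minimal sketch of that idea using an iterator (yield return), not a fix for the full problem: the total count is unchanged, so it is only practical for small maximum lengths. The names LocalPartGenerator and Enumerate are hypothetical, and the sketch makes no attempt to apply the actual local-part rules from the spec.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LocalPartGenerator
{
    // Lazily yields every string of length 1..maxLength over the alphabet,
    // shortest first. Only the string currently being built is held in memory.
    public static IEnumerable<string> Enumerate(IReadOnlyList<char> alphabet, int maxLength)
    {
        if (alphabet.Count == 0)
            yield break;

        for (int length = 1; length <= maxLength; length++)
        {
            // "Odometer" of alphabet indices; all zeros is the first string.
            var indices = new int[length];
            var buffer = new char[length];

            while (true)
            {
                for (int i = 0; i < length; i++)
                    buffer[i] = alphabet[indices[i]];
                yield return new string(buffer);

                // Advance the odometer from the rightmost position,
                // carrying to the left whenever a digit wraps around.
                int pos = length - 1;
                while (pos >= 0 && ++indices[pos] == alphabet.Count)
                {
                    indices[pos] = 0;
                    pos--;
                }

                if (pos < 0)
                    break; // every position wrapped: this length is done
            }
        }
    }

    public static void Main()
    {
        char[] lowercase = "abcdefghijklmnopqrstuvwxyz".ToCharArray();

        // Consume the sequence one item at a time; nothing is accumulated.
        foreach (string candidate in Enumerate(lowercase, 5).Take(30))
            Console.WriteLine(candidate);
    }
}
```

Because the sequence is produced on demand, each candidate can be passed straight to the code under test and then discarded, which sidesteps the List<T> capacity question for the lengths where enumeration is feasible at all.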