I have the following extension method to split a List<T> into a list of List<T>s with different chunk sizes, but I doubt its efficiency. Is there anything I can do to improve it, or is it fine as is?
    public static List<List<T>> Split<T>(this List<T> source, params int[] chunkSizes)
    {
        int totalSize = chunkSizes.Sum();
        int sourceCount = source.Count;
        if (totalSize > sourceCount)
        {
            throw new ArgumentException("Sum of chunk sizes is larger than the number of elements in source.", "chunkSizes");
        }

        List<List<T>> listOfLists = new List<List<T>>(chunkSizes.Length);
        int index = 0;
        foreach (int chunkSize in chunkSizes)
        {
            listOfLists.Add(source.GetRange(index, chunkSize));
            index += chunkSize;
        }

        // Get the entire last part if the total size of all the chunks is less than the actual size of the source
        if (totalSize < sourceCount)
        {
            listOfLists.Add(source.GetRange(index, sourceCount - totalSize));
        }

        return listOfLists;
    }
Example code usage:
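    List<int> list = new List<int> { 1, 2, 4, 5, 6, 7, 8, 9, 10, 12, 43, 23, 453, 34, 23, 112, 4, 23 };
    var result = list.Split(2, 3, 3, 2, 1, 3);
    // Print each chunk on its own line; Console.WriteLine(result) would only print the type name.
    foreach (var chunk in result)
    {
        Console.WriteLine(string.Join(", ", chunk));
    }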
This gives the desired result, with a final list of 4 numbers, since the chunk sizes add up to 4 less than the size of my list.
I’m especially doubtful of the GetRange part as I fear this is just enumerating the same source over and over…
EDIT: I think I know a way to enumerate the source only once: do a foreach on the source itself and keep checking whether the number of elements collected so far equals the current chunk size. If so, add the new list and move on to the next chunk size. Thoughts?
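A minimal sketch of that single-pass idea, assuming every chunk size is positive. The class name ListSplitSketch and the method name SplitSinglePass are hypothetical, and because the source is only walked once, the "too few elements" check can only happen after the loop:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ListSplitSketch
{
    // Hypothetical single-pass variant: walks the source exactly once.
    // Assumption: all chunk sizes are positive (the GetRange version also accepts 0).
    public static List<List<T>> SplitSinglePass<T>(this IEnumerable<T> source, params int[] chunkSizes)
    {
        if (chunkSizes.Any(size => size <= 0))
        {
            throw new ArgumentOutOfRangeException("chunkSizes", "Chunk sizes must be positive.");
        }

        var result = new List<List<T>>();
        var current = new List<T>();
        int chunkIndex = 0;

        foreach (T item in source)
        {
            current.Add(item);

            // When the current chunk is full, store it and move to the next size.
            // Once all sizes are used up, remaining items collect in one last chunk.
            if (chunkIndex < chunkSizes.Length && current.Count == chunkSizes[chunkIndex])
            {
                result.Add(current);
                current = new List<T>();
                chunkIndex++;
            }
        }

        // The source ran out before every requested chunk was filled.
        if (chunkIndex < chunkSizes.Length)
        {
            throw new ArgumentException("Sum of chunk sizes is larger than the number of elements in source.", "chunkSizes");
        }

        // Leftover elements form the final chunk.
        if (current.Count > 0)
        {
            result.Add(current);
        }

        return result;
    }
}
```

Called as list.SplitSinglePass(2, 3, 3, 2, 1, 3) on the list above, this should also produce seven chunks with the last one holding the 4 leftover numbers. It works on any IEnumerable<T>, at the cost of adding elements one by one instead of copying ranges.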
There is no performance problem with this code.
GetRange is documented to be O(chunkSize), and this is also easy to deduce, since one of the most important properties of List<T> is exactly that it allows O(1) indexing.
That said, you could write a more LINQ-y version of the code like this:
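One way such a version might look, as a minimal sketch rather than a definitive implementation. The class name LinqSplitSketch and the method name SplitLinq are hypothetical, and it keeps GetRange for the actual copying so each chunk still costs O(chunkSize):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinqSplitSketch
{
    // Hypothetical LINQ-style variant of the original Split method.
    public static List<List<T>> SplitLinq<T>(this List<T> source, params int[] chunkSizes)
    {
        int totalSize = chunkSizes.Sum();
        if (totalSize > source.Count)
        {
            throw new ArgumentException("Sum of chunk sizes is larger than the number of elements in source.", "chunkSizes");
        }

        // Treat the leftover elements, if any, as one extra chunk at the end.
        IEnumerable<int> sizes = totalSize < source.Count
            ? chunkSizes.Concat(new[] { source.Count - totalSize })
            : chunkSizes;

        // The lambda advances a running start index, so the query must be
        // enumerated exactly once; the ToList() call below takes care of that.
        int start = 0;
        return sizes
            .Select(size =>
            {
                List<T> chunk = source.GetRange(start, size);
                start += size;
                return chunk;
            })
            .ToList();
    }
}
```

The side effect inside the Select lambda is a trade-off: it keeps the offset bookkeeping short, but the query is only safe because it is materialized immediately.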