I’m having to write an “immediate” mode implementation of Linq (due to memory allocation restrictions on Unity/Mono – long story, not really important).
Everything performs as fast as or faster than real Linq until I come to ThenBy. Clearly my method for applying it is flawed, as performance drops to 4x slower than the real deal.
So what I’m doing right now is, for each OrderBy or ThenBy clause:
- Create a list for the selector results and add the result of evaluating the selector on each item to it
- Create a lambda that takes two indices as parameters and compares the list entries at those indices, using the supplied comparer or the default one
It looks like this:
public static IEnumerable<T> OrderByDescending<T,TR>(this IEnumerable<T> source, Func<T,TR> clause, IComparer<TR> comparer = null)
{
    comparer = comparer ?? Comparer<TR>.Default;
    // Reuse the source if it is already a LinqList; otherwise copy it into a recycled one.
    var linqList = source as LinqList<T>;
    if(linqList == null)
    {
        linqList = Recycler.New<LinqList<T>>();
        linqList.AddRange(source);
    }
    if(linqList.sorter!=null)
        throw new Exception("Use ThenBy and ThenByDescending after an OrderBy or OrderByDescending");
    // Evaluate the selector once per item so the sort only compares cached keys.
    var keys = Recycler.New<List<TR>>();
    keys.Capacity = keys.Capacity > linqList.Count ? keys.Capacity : linqList.Count;
    foreach(var item in source)
    {
        keys.Add(clause(item));
    }
    // Swapping the arguments reverses the order; negating the result fails when Compare returns int.MinValue.
    linqList.sorter = (x,y)=>comparer.Compare(keys[y],keys[x]);
    return linqList;
}
public static IEnumerable<T> ThenBy<T,TR>(this IEnumerable<T> source, Func<T,TR> clause, IComparer<TR> comparer = null)
{
    comparer = comparer ?? Comparer<TR>.Default;
    // ThenBy is only valid on a LinqList that already has a primary sorter.
    var linqList = source as LinqList<T>;
    if(linqList == null || linqList.sorter==null)
    {
        throw new Exception("Use OrderBy or OrderByDescending first");
    }
    // Cache the secondary keys, indexed by original item position.
    var keys = Recycler.New<List<TR>>();
    keys.Capacity = keys.Capacity > linqList.Count ? keys.Capacity : linqList.Count;
    foreach(var item in source)
    {
        keys.Add(clause(item));
    }
    // z is the result of the earlier sorters; only compare these keys when they tie.
    linqList.sorters.Add((z,x,y)=>z != 0 ? z : comparer.Compare(keys[x],keys[y]));
    return linqList;
}
Then what I do in the sort function is create a lambda that applies the sorts in order, so I end up with a function that looks like a Comparer<int> and returns the correct ordering.
This is where the really poor performance starts. I’ve tried versions using currying and different signatures for the OrderBy and ThenBy functions, but nothing works any faster, and I’m wondering if I’m just missing a trick about multikey sorting.
The sort variables and function:
public List<Func<int,int,int,int>> sorters = new List<Func<int, int, int, int>>();
public Func<int,int,int> sorter;
public List<int> sortList = new List<int>();
bool sorted;
private List<T> myList = new List<T>();

void ResolveSorters()
{
    if(sorter==null)
        return;
    Func<int,int,int> function = null;
    if(sorters.Count==0)
    {
        function = sorter;
    }
    else
    {
        function = sorter;
        foreach(var s in sorters)
        {
            var inProgress = function;
            var current = s;
            function = (x,y)=>current(inProgress(x,y), x,y);
        }
    }
    sortList.Capacity = sortList.Capacity < myList.Count ? myList.Count : sortList.Capacity;
    sortList.Clear();
    sortList.AddRange(System.Linq.Enumerable.Range(0,myList.Count));
    //var c = myList.Count;
    /*for(var i =0; i < c; i++)
        sortList.Add(i);*/
    sortList.Sort(new Comparison<int>(function));
    sorted = true;
    sorters.Clear();
}
I’m guessing here, but I’ll take a shot at this anyway. I think we should try getting rid of the nested lambdas and the delegate conversions; I’m not sure how well those perform. The sort function should be this:
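A minimal sketch of such a sort function, assuming the per-key comparisons are kept in a plain array of `Comparison<int>` delegates over item indices. The `MultiKeySort` class and its method names are made up for illustration and are not part of the code in the question:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the LinqList class from the question.
public static class MultiKeySort
{
    // Builds one comparison that walks the per-key comparisons in a flat loop
    // and returns the first non-zero result (0 if every key ties).
    public static Comparison<int> Combine(Comparison<int>[] comparisons)
    {
        return (x, y) =>
        {
            for (var i = 0; i < comparisons.Length; i++)
            {
                var result = comparisons[i](x, y);
                if (result != 0)
                    return result;
            }
            return 0;
        };
    }

    // Returns the indices 0..count-1 ordered by the given comparisons.
    public static List<int> SortIndices(int count, params Comparison<int>[] comparisons)
    {
        var indices = new List<int>(count);
        for (var i = 0; i < count; i++)
            indices.Add(i);
        indices.Sort(Combine(comparisons));
        return indices;
    }
}
```

Each comparison here would play the role of one OrderBy or ThenBy clause, for example `(x, y) => comparer.Compare(keys[x], keys[y])` over a cached key list. Whether this actually beats the nested lambdas on Mono would need to be measured.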
That gets rid of the nested invocations; it is all a simple loop now. You can also build specialized versions for small numbers of sort keys:
Unroll the loop so that no arrays appear anymore during the sort.
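As a rough illustration of that unrolling, here is a sketch of a comparer specialized for exactly two keys. It assumes the keys have already been cached in lists indexed by item position, as in the question; the `TwoKeyIndexComparer` name and constructor shape are hypothetical:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer for exactly two sort keys. The loop over sorters is
// unrolled by hand, so no collection of delegates is touched during the sort.
public sealed class TwoKeyIndexComparer<TK1, TK2> : IComparer<int>
{
    private readonly List<TK1> keys1;
    private readonly IComparer<TK1> comparer1;
    private readonly List<TK2> keys2;
    private readonly IComparer<TK2> comparer2;

    public TwoKeyIndexComparer(List<TK1> keys1, IComparer<TK1> comparer1,
                               List<TK2> keys2, IComparer<TK2> comparer2)
    {
        this.keys1 = keys1;
        this.comparer1 = comparer1;
        this.keys2 = keys2;
        this.comparer2 = comparer2;
    }

    // x and y are item indices; the second key is only consulted on a tie.
    public int Compare(int x, int y)
    {
        var result = comparer1.Compare(keys1[x], keys1[y]);
        if (result != 0)
            return result;
        return comparer2.Compare(keys2[x], keys2[y]);
    }
}
```

An index list could then be sorted with `sortList.Sort(new TwoKeyIndexComparer<string, int>(names, StringComparer.Ordinal, ages, Comparer<int>.Default))`. Similar classes for one or three keys would follow the same pattern, with the generic loop kept as a fallback for longer chains.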
All of this is really working around the fact that we don’t have static knowledge of the sort function’s structure. It would be much faster if the comparison function were just handed in by the caller.
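To make that last point concrete, here is a sketch of what a caller-supplied comparison could look like. The `Person` type and the `PersonOrdering` class are invented for the example; the only library call relied on is `List<T>.Sort(Comparison<T>)`:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical item type used only for this example.
public class Person
{
    public string Name;
    public int Age;
}

public static class PersonOrdering
{
    // Cached once so that sorting does not allocate a new delegate per call.
    public static readonly Comparison<Person> ByNameThenAge = CompareByNameThenAge;

    // One hand-written comparison covering both keys: no key lists,
    // no composed lambdas, and the structure is known at compile time.
    private static int CompareByNameThenAge(Person a, Person b)
    {
        var result = string.CompareOrdinal(a.Name, b.Name);
        return result != 0 ? result : a.Age.CompareTo(b.Age);
    }
}
```

The caller would then write `people.Sort(PersonOrdering.ByNameThenAge)`. The trade-off is that this gives up the OrderBy/ThenBy syntax, which may or may not be acceptable for a Linq replacement.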
Update: Repro (100% more throughput than LINQ)