I’m investigating and performance testing various ways of randomizing ordered collections, and I was looking at the option of passing a Comparison delegate that just randomly returns the comparison result. For example:
int RandomComparison<T> (T x, T y)
{
    return this.random.Next (-1, 2);
}
However, as I do not know the sorting algorithm used under the hood, I do not know if this could possibly lead to the sort logic never completing. While this method seems generally unreliable performance-wise, I’m wondering if it is actually dangerously unusable?
List.Sort in fact is documented to use QuickSort (no additional details given), but I’ll ignore that in favour of talking about sorting in general. I suspect that for any sensible sort algorithm, this comparator results in the operation terminating with probability 1, in the sense that the probability of it lasting N steps tends to 0 as N tends to infinity. In fact I think in a lot of cases there’s a hard upper limit.
The reason is that it’s possible (occurs with non-zero probability) that the random comparator could just so happen to return consistent results for enough comparisons in a row (O(n log n) of them, perhaps, or O(n^2) for insertion sort) that the algorithm “thinks” it’s finished. As long as this happens eventually, I’d kind of expect it to terminate the sort.
However, I can’t be certain of this, because it’s by no means impossible that the algorithm has got into an unrecoverable state before this brief period of consistency. I just have a hunch that for practical sort algorithms, that won’t happen. And indeed for a lot of algorithms the problem will inexorably get smaller regardless of how nonsensical the comparator is, hence yielding a hard limit. QuickSort in particular works by repeatedly partitioning the array – if the comparator is random then this will result in nonsensical partitions, but as long as no actual errors occur due to the inconsistency, the array will still be divided into two parts for recursion. The fact that the elements in the “top” part won’t compare greater than the pivot on a second time of asking quite likely doesn’t matter at all, since they’ll never be compared with it again.
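To make the "hard limit" point concrete, here is a minimal sketch of a textbook quicksort (Lomuto partition, last element as pivot) fed a random comparator. It is my own illustrative implementation, not the code behind List.Sort, and the names RandomComparatorQuickSort and Comparisons are made up for the example. Because the pivot is left out of both recursive calls, a range of m elements costs m - 1 comparisons and the whole sort makes at most n(n-1)/2 of them, whatever the comparator says.

```csharp
using System;
using System.Linq;

static class RandomComparatorQuickSort
{
    // Number of comparator calls made by the most recent Sort call.
    public static long Comparisons;

    public static void Sort<T>(T[] items, Comparison<T> compare)
    {
        Comparisons = 0;
        Sort(items, 0, items.Length - 1, compare);
    }

    static void Sort<T>(T[] items, int lo, int hi, Comparison<T> compare)
    {
        if (lo >= hi) return;

        // Lomuto partition: the pivot is the last element of the range.
        T pivot = items[hi];
        int store = lo;
        for (int i = lo; i < hi; i++)
        {
            Comparisons++;
            if (compare(items[i], pivot) < 0)
            {
                (items[i], items[store]) = (items[store], items[i]);
                store++;
            }
        }
        (items[store], items[hi]) = (items[hi], items[store]);

        // The pivot is excluded from both halves, so every call shrinks the
        // problem regardless of what the comparator returned.
        Sort(items, lo, store - 1, compare);
        Sort(items, store + 1, hi, compare);
    }

    static void Main()
    {
        var random = new Random();
        int[] data = Enumerable.Range(0, 1000).ToArray();

        // Same idea as RandomComparison in the question: ignore the
        // arguments and return -1, 0 or 1 at random.
        Sort(data, (x, y) => random.Next(-1, 2));

        Console.WriteLine($"Finished after {Comparisons} comparisons (limit {1000L * 999 / 2}).");
    }
}
```

The array only ever has elements swapped, so the result is still a permutation of the input. An implementation that scans inwards from both ends and relies on the pivot acting as a sentinel could behave differently, which is where the out-of-bounds worry comes from.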
Anyway, it might take a very long time, long enough to constitute “dangerous” for most practical purposes. A bubble sort, for example, would just keep stirring things around until it gets n negative results in a row to complete a sweep without moving anything. That would take expected time 2^n or thereabouts (not that List.Sort in particular could be a bubble sort, I use it only to illustrate what might happen for sorting in general). And depending on implementation details, you might find that you access out of bounds, due to the “impossible” happening, more times than you “finish”.
The operation certainly doesn’t necessarily randomize the collection with all permutations equally likely. Again, note that QuickSort chooses a pivot and then partitions around it. Given this random comparator (and, again, assuming no actual errors like out-of-bounds access), the probability that the first-chosen pivot ends up at the far right hand side of the array is 1 in 2^(n-1) (since all other elements must be sorted left of it, a half chance for each), whereas in a uniformly-randomly-selected permutation, that probability should be 1 in n. The first chosen pivot is distributed in the array on a bell curve, when it “should” be uniform.
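The bell curve is easy to see in a quick simulation. The sketch below assumes a Lomuto-style partition and a coin-flip comparator (each element goes left of the pivot with probability 1/2), which is a simplification I chose to match the "half chance" above. With Next (-1, 2) the exact odds depend on how the implementation treats a zero result. PivotPositionDemo and its methods are hypothetical names for this example only.

```csharp
using System;

static class PivotPositionDemo
{
    // Simulates one partition of n elements with a coin-flip comparator and
    // returns the index where the pivot ends up. In a Lomuto partition that
    // index equals the number of elements that compared "less than" the
    // pivot, so counting is enough and no array needs to be moved around.
    public static int PartitionOnce(int n, Random random)
    {
        int store = 0;
        for (int i = 0; i < n - 1; i++)
        {
            if (random.Next(2) == 0) store++; // coin flip: element goes left
        }
        return store;
    }

    // Counts how often the pivot lands on each index over many trials.
    public static int[] Histogram(int n, int trials, Random random)
    {
        var counts = new int[n];
        for (int t = 0; t < trials; t++) counts[PartitionOnce(n, random)]++;
        return counts;
    }

    static void Main()
    {
        const int n = 8, trials = 1_000_000;
        int[] counts = Histogram(n, trials, new Random());
        for (int pos = 0; pos < n; pos++)
        {
            double observed = (double)counts[pos] / trials;
            Console.WriteLine($"position {pos}: {observed:F4} (uniform would be {1.0 / n:F4})");
        }
    }
}
```

For n = 8 the far-right slot should come up about 1 time in 128 rather than 1 time in 8, with most of the mass on the middle positions.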