I have written an in-place merge sort for sorting a large data set of random size (100,000 elements or more). I was thinking about adding insertion sort for when the data is almost sorted, to make the algorithm run a little faster. Is this possible with an in-place merge sort?
Here is some of my code.
    public static void merge(ArrayList<String> list, int low, int high) {
        if (low < high) {
            int mid = (low + high) / 2;
            merge(list, low, mid);
            merge(list, mid + 1, high);
            mergeSort(list, low, mid, high);
        }
    }

    public static void mergeSort(ArrayList<String> list, int first, int mid, int last) {
        int left = first;
        int right = mid + 1;
        String holder = "";
        // if mid <= mid+1 skip merge
        if (compareTo(list.get(mid), list.get(right)) <= 0) {
            return;
        }
        while (left <= mid && right <= last) {
            // if left index <= right index then just add to left
            if (compareTo(list.get(left), list.get(right)) <= 0) {
                left++;
            } else {
                holder = list.get(right);
                // moves everything from left to right-left up one index in the arraylist
                copyList(list, left, right - left);
                list.set(left, holder);
                left++;
                mid++;
                right++;
            }
        }
        // what is left is in place
    }

    public static void copyList(ArrayList<String> source, int srcPos, int length) {
        String temp1 = "";
        String temp2 = source.get(srcPos);
        for (int i = 0; i < length; i++) {
            temp1 = source.get(srcPos + 1);
            source.set(srcPos + 1, temp2);
            temp2 = temp1;
            srcPos++;
        }
    }
Now, I was thinking of implementing insertion sort by counting the number of elements when I first put them into the ArrayList, and then changing my merge method to the following.
    public static void merge(ArrayList<String> list, int low, int high) {
        if (high - low == dataSize - 1) {
            int mid = (low + high) / 2;
            merge(list, low, mid);
            merge(list, mid + 1, high);
            insertionSort(list);
        } else if (low < high) {
            int mid = (low + high) / 2;
            merge(list, low, mid);
            merge(list, mid + 1, high);
            mergeSort(list, low, mid, high);
        }
    }
However, this actually makes my algorithm take an eternity. I'm guessing I'm doing this wrong and the algorithm is taking O(n^2) time, since the data is randomly generated and nowhere close to sorted.
What am I doing wrong? Any suggestions? My guess is that it won't work because it's an in-place merge sort.
Thanks!
Such algorithms are complicated and easy to get wrong. I implemented something very similar: an in-place stable merge sort. It also uses insertion sort for small sub-lists. I suggest having a look at the source code and comparing it with what you are doing. You might also be interested in an in-place stable quicksort.
Unless I’m mistaken your implementation is not stable (it might re-arrange elements that are equal). Depending on the use case, this may or may not be a problem.
Also, it seems your implementation is O(n^2), because the copyList method is O(n) and can be called up to O(n) times in a single merge.
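To make the quadratic cost concrete, here is a rough sketch that counts how many element moves a shifting merge performs on a bad input (every element of the left run is larger than every element of the right run). The names ShiftCount, shiftingMerge and worstCaseMoves are made up for this illustration, and it uses Integer elements instead of String to keep it short; the inner loop is assumed to do the same job as copyList.

```java
import java.util.ArrayList;
import java.util.List;

class ShiftCount {
    // Merges the sorted runs [first..mid] and [mid+1..last] by shifting,
    // following the structure of the question's mergeSort method.
    // Returns the number of element moves done by the shifting step.
    static long shiftingMerge(List<Integer> list, int first, int mid, int last) {
        long moves = 0;
        int left = first;
        int right = mid + 1;
        while (left <= mid && right <= last) {
            if (list.get(left).compareTo(list.get(right)) <= 0) {
                left++;
            } else {
                Integer holder = list.get(right);
                // Shift [left..right-1] up by one slot (the role of copyList).
                for (int i = right; i > left; i--) {
                    list.set(i, list.get(i - 1));
                    moves++;
                }
                list.set(left, holder);
                left++;
                mid++;
                right++;
            }
        }
        return moves;
    }

    // Builds a list whose left run holds the larger half of 0..n-1 and whose
    // right run holds the smaller half, then merges the two runs (n must be even).
    static long worstCaseMoves(int n) {
        List<Integer> list = new ArrayList<>();
        for (int i = n / 2; i < n; i++) list.add(i);
        for (int i = 0; i < n / 2; i++) list.add(i);
        return shiftingMerge(list, 0, n / 2 - 1, n - 1);
    }

    public static void main(String[] args) {
        for (int n : new int[] {1000, 2000, 4000}) {
            System.out.println(n + " elements: " + worstCaseMoves(n) + " moves");
        }
    }
}
```

On this input every element of the right run has to travel past the whole left run, so the move count is (n/2)^2: doubling n quadruples the work of that one merge.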
About the insertionSort: what is dataSize, and why do you compare it using equals? Don't you want to use < instead? If you do, the else if (low < high) is redundant (it is always true).
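As a rough sketch of what the < comparison could look like: small ranges go to an insertion sort that is restricted to that range, and everything larger is split and merged as before. The class name HybridMergeSort and the cutoff INSERTION_THRESHOLD (and its value of 16) are assumptions made for this example, not taken from the question, and String.compareTo stands in for the question's compareTo helper. The merge step keeps the shifting approach, so this does not remove the quadratic worst case of the merge itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class HybridMergeSort {
    // Hypothetical cutoff, chosen for illustration; tune it by measuring.
    static final int INSERTION_THRESHOLD = 16;

    static void sort(List<String> list) {
        sort(list, 0, list.size() - 1);
    }

    static void sort(List<String> list, int low, int high) {
        if (high - low < INSERTION_THRESHOLD) {
            // Small (or empty) range: sort it directly, no further recursion.
            insertionSort(list, low, high);
            return;
        }
        int mid = low + (high - low) / 2;
        sort(list, low, mid);
        sort(list, mid + 1, high);
        merge(list, low, mid, high);
    }

    // Sorts only list[low..high], not the whole list.
    static void insertionSort(List<String> list, int low, int high) {
        for (int i = low + 1; i <= high; i++) {
            String key = list.get(i);
            int j = i - 1;
            while (j >= low && list.get(j).compareTo(key) > 0) {
                list.set(j + 1, list.get(j));
                j--;
            }
            list.set(j + 1, key);
        }
    }

    // Shifting merge with the same structure as the question's mergeSort method.
    static void merge(List<String> list, int first, int mid, int last) {
        int left = first;
        int right = mid + 1;
        if (list.get(mid).compareTo(list.get(right)) <= 0) {
            return; // the two runs are already in order
        }
        while (left <= mid && right <= last) {
            if (list.get(left).compareTo(list.get(right)) <= 0) {
                left++;
            } else {
                String holder = list.get(right);
                for (int i = right; i > left; i--) {
                    list.set(i, list.get(i - 1));
                }
                list.set(left, holder);
                left++;
                mid++;
                right++;
            }
        }
    }

    public static void main(String[] args) {
        List<String> data = new ArrayList<>(Arrays.asList("pear", "apple", "fig", "kiwi"));
        sort(data);
        System.out.println(data);
    }
}
```

The important difference from the version in the question is that insertion sort replaces the recursion for small ranges and only touches that range, instead of running once over the entire list after both halves have already been sorted.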