intro
divide-and-conquer
Mergesort is based on a simple operation known as merging: combining two ordered arrays to make one larger ordered array. This operation immediately leads to a simple recursive sort method known as mergesort: to sort an array, divide it into two halves, sort the two halves (recursively), and then merge the results. As you will see, one of mergesort’s most attractive properties is that it guarantees to sort any array of N items in time proportional to N log N. Its prime disadvantage is that it uses extra space proportional to N.
use a third array
The straightforward approach to implementing merging is to design a method that merges two disjoint ordered arrays of Comparable objects into a third array. This strategy is easy to implement: create an output array of the requisite size and then choose successively the smallest remaining item from the two input arrays to be the next item added to the output array.
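The merge into a third array described above can be sketched in Go as follows. This is a minimal illustration, not the text's own implementation: the name mergeInto is hypothetical, and it works on int slices rather than Comparable objects to keep the example short.

```go
package main

import "fmt"

// mergeInto is a hypothetical helper (not from the text): it merges two
// sorted slices into a newly allocated third slice.
func mergeInto(a, b []int) []int {
	out := make([]int, 0, len(a)+len(b)) // output array of the requisite size
	i, j := 0, 0
	// Repeatedly take the smallest remaining item from the two inputs.
	for i < len(a) && j < len(b) {
		if a[i] <= b[j] { // on ties take from a, which keeps the merge stable
			out = append(out, a[i])
			i++
		} else {
			out = append(out, b[j])
			j++
		}
	}
	// One input is exhausted; the rest of the other is already in order.
	out = append(out, a[i:]...)
	out = append(out, b[j:]...)
	return out
}

func main() {
	fmt.Println(mergeInto([]int{1, 4, 9}, []int{2, 3, 10})) // prints [1 2 3 4 9 10]
}
```

Note that every call allocates a fresh output slice, which is exactly the cost that becomes a problem when a sort performs many merges.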
However, when we mergesort a large array, we are doing a huge number of merges, so the cost of creating a new array to hold the output every time that we do a merge is problematic. It would be much more desirable to have an in-place method so that we could sort the first half of the array in place, then sort the second half of the array in place, then do the merge of the two halves by moving the items around within the array, without using a significant amount of other extra space. It is worthwhile to pause momentarily to consider how you might do that. At first blush, this problem seems to be one that must be simple to solve, but solutions that are known are quite complicated, especially by comparison to alternatives that use extra space.
complexity
Proposition F. Top-down mergesort uses between ½ N lg N and N lg N compares to sort any array of length N. (This applies to the unoptimized version.)
Proposition G. Top-down mergesort uses at most 6 N lg N array accesses to sort an array of length N. (This applies to the unoptimized version.)
Proof: Each merge uses at most 6N array accesses: 2N for the copy (the whole subarray is copied each time, and each read and each write counts as one access; copying only one half would actually suffice), 2N for the move back, and at most 2N for compares. The result follows from the same argument as for Proposition F.
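One way to check the compare bound empirically is to instrument a plain top-down mergesort with a counter. The sketch below is an illustration under assumptions: countCompares is a hypothetical helper, the sort copies the whole subarray into an auxiliary slice as in the unoptimized version, and N is chosen as a power of two so that lg N is an integer.

```go
package main

import "fmt"

// countCompares (hypothetical helper) sorts a copy of in with an
// unoptimized top-down mergesort and returns the number of item compares.
func countCompares(in []int) int {
	a := append([]int(nil), in...) // work on a copy; the input is left untouched
	aux := make([]int, len(a))
	compares := 0
	var rec func(lo, hi int)
	rec = func(lo, hi int) {
		if lo >= hi {
			return
		}
		mid := lo + (hi-lo)/2
		rec(lo, mid)
		rec(mid+1, hi)
		copy(aux[lo:hi+1], a[lo:hi+1]) // copy the whole subarray, as in the proof
		i, j := lo, mid+1
		for k := lo; k <= hi; k++ {
			switch {
			case i > mid: // left half exhausted, no compare needed
				a[k] = aux[j]
				j++
			case j > hi: // right half exhausted, no compare needed
				a[k] = aux[i]
				i++
			default:
				compares++
				if aux[j] < aux[i] {
					a[k] = aux[j]
					j++
				} else {
					a[k] = aux[i]
					i++
				}
			}
		}
	}
	rec(0, len(a)-1)
	return compares
}

func main() {
	const n = 1024 // power of two, so lg N = 10
	sorted := make([]int, n)
	scrambled := make([]int, n)
	for i := range sorted {
		sorted[i] = i
		scrambled[i] = (i * 389) % n // 389 is odd, so this is a permutation of 0..n-1
	}
	fmt.Println("bounds:", n*10/2, "to", n*10)
	fmt.Println("sorted input:", countCompares(sorted))
	fmt.Println("scrambled input:", countCompares(scrambled))
}
```

For already sorted input every merge exhausts its left half after exactly half as many compares as there are items, so the count for N = 1024 is 5120, which is the ½ N lg N lower bound.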
optimization
Use insertion sort for small subarrays
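A rough sketch of this idea in Go follows. The names hybridSort, sortWithCutoff, and insertionSort are hypothetical, and the cutoff value of 8 is an assumption for illustration only; a suitable threshold would have to be found by measurement.

```go
package main

import "fmt"

// cutoff is an assumed threshold: subarrays with at most this many items
// are handed to insertion sort instead of being split further.
const cutoff = 8

// insertionSort sorts the inclusive range a[lo..hi].
func insertionSort(a []int, lo, hi int) {
	for i := lo + 1; i <= hi; i++ {
		for j := i; j > lo && a[j] < a[j-1]; j-- {
			a[j], a[j-1] = a[j-1], a[j]
		}
	}
}

// sortWithCutoff is a top-down mergesort on a[lo..hi] that switches to
// insertion sort for small subarrays.
func sortWithCutoff(a, aux []int, lo, hi int) {
	if hi-lo < cutoff { // at most cutoff items (also covers empty ranges)
		insertionSort(a, lo, hi)
		return
	}
	mid := lo + (hi-lo)/2
	sortWithCutoff(a, aux, lo, mid)
	sortWithCutoff(a, aux, mid+1, hi)
	// Standard merge through the auxiliary slice.
	copy(aux[lo:hi+1], a[lo:hi+1])
	i, j := lo, mid+1
	for k := lo; k <= hi; k++ {
		if j > hi || (i <= mid && aux[i] <= aux[j]) {
			a[k] = aux[i]
			i++
		} else {
			a[k] = aux[j]
			j++
		}
	}
}

// hybridSort sorts a in ascending order.
func hybridSort(a []int) {
	aux := make([]int, len(a)) // allocated once and reused by every merge
	sortWithCutoff(a, aux, 0, len(a)-1)
}

func main() {
	data := []int{9, 3, 7, 1, 8, 2, 6, 0, 5, 4, 19, 13, 17, 11, 18, 12, 16, 10, 15, 14}
	hybridSort(data)
	fmt.Println(data) // prints the numbers 0 through 19 in ascending order
}
```

The recursion structure is unchanged; only the base case differs, so the many tiny subarrays at the bottom of the recursion are sorted without further calls or merges.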
Test whether the array is already in order
We can reduce the running time to be linear for arrays that are already in order by adding a test to skip the call to merge() if a[mid] is less than or equal to a[mid+1]. With this change, we still do all the recursive calls, but the running time for any sorted subarray is linear.
Go implementation
// merge merges the sorted runs nums[start..mid] and nums[mid+1..end] in place.
// It requires start <= mid < end.
func merge(nums *[]int, start, mid, end int) {
	a := *nums
	if a[mid] <= a[mid+1] {
		return // the two runs are already in order, so skip the copy and the merge
	}
	// Copy the left run. Slicing alone (a[start : mid+1]) would share the same
	// underlying array and be overwritten during the merge, so a copy is required.
	tempNums := make([]int, mid-start+1)
	copy(tempNums, a[start:mid+1])
	left, right := start, mid+1
	// Once the left run is exhausted, the remaining right items are already in place.
	for i := start; left <= mid; i++ {
		if right > end || tempNums[left-start] <= a[right] {
			a[i] = tempNums[left-start] // <= takes from the left on ties, keeping the sort stable
			left++
		} else {
			a[i] = a[right]
			right++
		}
	}
}
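// MergeSort sorts *nums in ascending order using top-down mergesort.
func MergeSort(nums *[]int) {
	mergeSort(nums, 0, len(*nums)-1)
}

// mergeSort (top-down) sorts the inclusive range nums[start..end].
func mergeSort(nums *[]int, start, end int) {
	if start >= end {
		return // zero or one item; >= also stops the recursion for an empty slice
	}
	mid := start + (end-start)/2 // middle index, included in the left half
	mergeSort(nums, start, mid)
	mergeSort(nums, mid+1, end)
	merge(nums, start, mid, end)
}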
Bottom-up mergesort
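// mergeSortBU sorts *nums in ascending order using bottom-up mergesort.
func mergeSortBU(nums *[]int) {
	n := len(*nums)
	// sz is the size of each of the two runs being merged
	for sz := 1; sz < n; sz *= 2 {
		for lo := 0; lo < n-sz; lo += 2 * sz {
			hi := lo + 2*sz - 1
			if hi > n-1 {
				hi = n - 1 // the last run may be shorter than sz
			}
			merge(nums, lo, lo+sz-1, hi)
		}
	}
}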
complexity
Bottom-up mergesort uses between ½ N lg N and N lg N compares and at most 6 N lg N array accesses to sort an array of length N. (These are the same bounds as for top-down mergesort.)
Summary
Both the top-down and bottom-up approaches to implementing a divide-and-conquer algorithm are intuitive. The lesson that you can take from mergesort is this: Whenever you encounter an algorithm based on one of these approaches, it is worth considering the other.