Sort | gavincode

Selection sort

In iteration i, find index min of smallest remaining entry.
Swap a[i] and a[min]

i 从0开始，找到数组中最小的，与i交换，i++，直到 i 等于数组长度。

Comparable[] a;
int N = a.length;
for(int i = 0; i < N; i++) {
    int min = i;
    for(int j = i + 1; j < N; j++) {
        if(less(a[j], a[min]))
            min = j;
    }
    exch(a, i, min);
}

Both best case and worst case:

$$ \theta(n^2) $$

not stable

Insertion sort

In iteration i, swap a[i] with each larger entry to its left.

int N = a.length;
for(int i = 0; i < N; i++) {
    for(int j = i ; j > 0; j--) {
        if(less(a[j], a[j - 1])) { // a[j] < a[j - 1]
            exch(a, j, j - 1);
        } else {
            break;
        }
    }
}

Best case $\Omega(N)$, worst case $O(N^2)$

stable

Shellsort

Move entries more than one position at a time by h-sorting the array

灵感来自 Insertion sort 对于已部分排序的 Array 效率较高的特点。把 Array 以 h 为间隔进行排序，例如 h = 4 。那么就每隔4个元素用 Insertion sort 排序，然后再按 h = 1 排序。

排序间隔如何确定？

Powers of two 1,2,4,8,16,32,.....      
NO

Powers of two minus one 1,3,7,15,31,63.....
Maybe

3x+1 1,4,13,40,121,363......
OK. Easy to compute

Sedgewick. 1,5,19,41,109,209,505,929,2161,3905......
Good. Tough to beat in empirical sudies.

int N = a.length;
int h = 1;
while (h < N/3) h = 3 * h + 1;

while (h >= 1) {
    for(int i = h; i < N; i++) {
        for(int j = i; j >= h && less(a[j], a[j-h]); j -= h) {
            exch(a, j, j - h);
        }
    }
    h = h / 3;
}

最大的问题是没有一个确定的模型来计算 h 。

Better than $\theta(N^\frac{3}{2})$

not stable

Merge sort

private static void merge(Comparable[] a, Comparable[] aux, int lo, int mid, int hi) {
    assert isSorted(a, lo, mid);
    assert isSorted(aux, mid + 1, hi);

    // Copy
    for(int k = lo; k <= hi; k++) { 
        aux[k] = a[k];
    }

    // Merge
    int i = lo, j = mid + 1;
    for(int k = lo; k <= hi; k++) {
        if (i > mid) a[k] = aux[j++];
        else if (j > hi) a[k] = aux[i++];
        else if (less(aux[j], aux[i])) a[k] = aux[j++];
        else a[k] = aux[i++];
    }
}

private static void sort(Comparable[] a, Comparable[] aux, int lo, int hi) {
    if (hi < lo) return;
    int mid = lo + (hi - lo) / 2;
    sort(a, aux, lo, mid);
    sort(a, aux, mid + 1, hi);
    merge(a, aux, lo, mid, hi);
}

private static void sort(Comparable[] a) {
    aux = new Comparable[a.length];
    sort(a, aux, 0, a.length - 1);
}

Improvements:

private static void sort(Comparable[] a, Comparable[] aux, int lo, int hi) {

    if (hi <= lo + CUTOFF - 1) {
        Insertion.sort(a, lo, hi);
        return;
    }
    int mid = lo + (hi - lo) / 2;
    sort(a, aux, lo, mid);
    sort(a, aux, mid + 1, hi);
    if (!less(a[mid+1], a[mid])) return;
    merge(a, aux, lo, mid, hi);
}

Bottom-up mergesort

public static void sort(Comparable[] a) {
    int N = a.length;
    aux = new Comparable[N];
    for(int sz = 1; sz < N; sz = sz + sz) {
        for(int lo = 0; lo < N - sz; lo += sz + sz) {
            merge(a, lo, lo + sz - 1, Math.min(lo + sz + sz - 1, N - 1));
        }
    }
}

stable

Quick sort

private static int partition(Comparable[] a, int lo, int hi) {
    int i = lo, j = hi + 1;
    while (true) {
        while (less(a[++i], a[lo]))
            if (i == hi) break;

        while (less(a[lo], a[--j]))
            if (j == lo) break; // redundant

        if (i >= j) break;
        exch(a, i, j);
    }

    exch(a, lo, j);
    return j;
}

public class Quick {
    public static void sort(Comparable[] a) {
        StdRandom.shuffle(a);
        sort(a, 0, a.length - 1);
    }

    private static void sort(Comparable[] a, int lo, int hi) {
        if (hi <= lo) return;
        int j = partition(a, lo, hi);
        sort(a, lo, j - 1);
        sort(a, j + 1, hi);
    }
}

Improvements

private static void sort(Comparable[] a, int lo, int hi) {
    if (hi <= lo + CUTOFF - 1) {
        Insertion.sort(a, lo, hi);
        return;
    }

    int m = medianOf3(a, lo, lo + (hi - lo) / 2, hi);
    swap(a, lo, m);

    int j = partition(a, lo, hi);
    sort(a, lo, j - 1);
    sort(a, j + 1, hi);
}

Selection

Find the “top k”

public static Comparable select(Comparable[] a, int k) {
    StdRandom.shuffle(a);
    int lo = 0, hi = a.length - 1;
    while(hi > lo) {
        int j = partition(a, lo, hi);
        if ( j < k) lo = j + 1;
        else if ( j > k) hi = j - 1;
        else return a[k];
    }
    return a[k];
}

Duplicate keys

Dijkstra 3-way quick sort

private static void sort(Comparable[] a, int lo, int hi) {
    if (hi <= lo) return;
    int lt = lo, gt = hi;
    Comparable v = a[lo];
    int i = lo;
    while (i <= gt) {
        int cmp = a[i].compareTo(v);
        if (cmp < 0) exch(a, lt++, i++);
        else if (cmp > 0) exch(a, i, gt--);
        else i++;
    }

    sort(a, lo, lt - 1);
    sort(a, gt + 1, hi);
}

Priority queue

Stack: Remove the item most recently added. Queue: Remove the item least recently added. Randomized queue: Remove a random item. Priority queue: Remove the largest(or smallest) item.

用途：想象有几T的数据，想找其中前1000最大的数，就可以使用MinPQ。逐条数据插入 MinPQ，如果 PQ 里面数据总数大于 1000，执行 delMin 。最后过滤出前1000大的数。

API

public class MaxPQ<Key extends Comparable> MaxPQ() MaxPQ(key[] a) void insert(Key v) key delMax() boolean isEmpty() Key max() int size()

public class UnorderedMaxPQ<Key extends Comparable<Key>> {
    private Key[] pq;
    private int N;

    public UnorderedMaxPQ(int capacity) {
        pq = (Key[]) new Comparable[capacity];
    }

    public boolean isEmpty() {
        return N == 0;
    }

    public void insert(Key x) {
        pq[N++] = x;
    }

    public Key delMax() {
        int max = 0;
        for(int i = 1; i < N; i++) {
            if (less(max, i)) max = i;
        }
        exch(max, N - 1);
        return pq[--N];
    }
}

Binary heap

很好的解决了 Priority queue 的性能问题。

Heap-ordered binary tree

Kyes in nodes
Parent’s key no smaller than children’s keys.

Array representation

Indices start at 1
Take nodes in level order
No explicit links needed

Proposition: Largest key is a[1], which is root of binary tree. Proposition: Can use array indices to move through tree.

Parent of node at k is at k/2
Children of node at k are at 2k and 2k+1

(增加节点时)当子节点比父节点大，需要调整节点:

private void swim(int k) {
    while (k > 1 && less(k/2, k)) {
        exch(k, k/2);
        k = k/2;
    }
}

public void insert(Key x) {
    if (N > pq.length - 1) resize(2*pq.length);

    pq[++N] = x;
    swim(N);
}

(删除节点后)当父节点比子节点小，可以运行sink:

private void sink(int k) {
    while(2 * k < N) {
        int j = 2 * k;
        if (j < N && less(j, j + 1)) j++;
        if(!less(k, j)) break;
        exch(k, j);
        k = j;
    }
}

public Key delMax() {
    Key max = pq[1];
    exch(1, N--);
    sink(1);
    pq[N + 1] = null;
    return max;
}

Heapsort

根据堆的特性，可以用来排序.

Build max heap using bottom-up method.
Repeatedly delete the largest remaining item.

for(int k = N / 2; k >= 1; k--) {
    sink(a, k, N);
}

while(N > 1) {
    exch(a, 1, N--);
    sink(a, 1, N);
}

Selection sort#

Insertion sort#

Shellsort#

Merge sort#

Bottom-up mergesort#

Quick sort#

Selection#

Duplicate keys#

Priority queue#

API#

Binary heap#

Heapsort#