计数排序

计数排序
计数排序
概况
類別	排序算法
資料結構	数组
复杂度
平均時間複雜度
最坏时间复杂度
最优时间复杂度
空間複雜度
相关变量的定义

计数排序（英語：Counting sort）是一种稳定的线性时间排序算法。该算法于1954年由哈羅德·H·西華德提出。计数排序使用一个额外的数组 $C$ ，其中第i个元素是待排序数组 $A$ 中值等于 $i$ 的元素的个数。然后根据数组 $C$ 来将 $A$ 中的元素排到正确的位置。

计数排序的特征

当输入的元素是 $n$ 个 $0$ 到 $k$ 之间的整数时，它的运行时间是 $\Theta (n+k)$ 。计数排序不是比较排序，因此不被 $\Omega (n\log n)$ 的下界限制。

由于用来计数的数组 $C$ 的长度取决于待排序数组中数据的范围（等于待排序数组的最大值与最小值的差加上1），这使得计数排序对于数据范围很大的数组，需要大量时间和内存。例如：计数排序是用来排序0到100之间的数字的最好的算法，但是它不适合按字母顺序排序人名。但是，计数排序可以用在基数排序算法中，能够更有效的排序数据范围很大的数组。

通俗地理解，例如有10个年龄不同的人，统计出有8个人的年龄比A小，那A的年龄就排在第9位，用这个方法可以得到其他每个人的位置，也就排好了序。当然，年龄有重复时需要特殊处理（保证稳定性），这就是为什么最后要反向填充目标数组，以及将每个数字的统计减去1。算法的步骤如下：

找出待排序的数组中最大和最小的元素
统计数组中每个值为 $i$ 的元素出现的次数，存入数组 $C$ 的第 $i-minValue$ 项
对所有的计数累加（从 $C$ 中的第一个元素开始，每一项和前一项相加）
反向填充目标数组：将每个元素 $i$ 放在新数组的第 $C[i]$ 项，每放一个元素就将 $C[i]$ 减去1

Java語言的實现

public class CountingSort {
    public static void main(String[] args) {
        int[] A = CountingSort.countingSort(new int[]{16, 4, 10, 14, 7, 9, 3, 2, 8, 1});
        Utils.print(A);
    }

    public static int[] countingSort(int[] A) {
        int[] B = new int[A.length];
        // 假设A中的数据a'有，0<=a' && a' < k并且k=100
        int k = 100;
        countingSort(A, B, k);
        return B;
    }

    private static void countingSort(int[] A, int[] B, int k) {
        int[] C = new int[k];
        // 计数
        for (int j = 0; j < A.length; j++) {
            int a = A[j];
            C[a] += 1;
        }
        Utils.print(C);
        // 求计数和
        for (int i = 1; i < k; i++) {
            C[i] = C[i] + C[i - 1];
        }
        Utils.print(C);
        // 整理
        for (int j = A.length - 1; j >= 0; j--) {
            int a = A[j];
            B[C[a] - 1] = a;
            C[a] -= 1;
        }
    }
}


//针对c数组的大小，优化过的计数排序
public class CountSort{
	public static void main(String []args){
		//排序的数组
		int a[] = {100, 93, 97, 92, 96, 99, 92, 89, 93, 97, 90, 94, 92, 95};
		int b[] = countSort(a);
		for(int i : b){
			System.out.print(i + "  ");
		}
		System.out.println();
	}
	public static int[] countSort(int []a){
		int b[] = new int[a.length];
		int max = a[0], min = a[0];
		for(int i : a){
			if(i > max){
				max = i;
			}
			if(i < min){
				min = i;
			}
		}
		//这里k的大小是要排序的数组中，元素大小的极值差+1
		int k = max - min + 1;
		int c[] = new int[k];
		for(int i = 0; i < a.length; ++i){
			c[a[i]-min] += 1;//优化过的地方，减小了数组c的大小
		}
		for(int i = 1; i < c.length; ++i){
			c[i] = c[i] + c[i-1];
		}
		for(int i = a.length-1; i >= 0; --i){
			b[--c[a[i]-min]] = a[i];//按存取的方式取出c的元素
		}
		return b;
	}
}


//另一種參考，
//缺點:不適合數據落差大、浮點數，數據落差大會生成大a數組，數少用其他演算更好。
//優點:線性快，固定重複巨量數據適用，沒有更快，理論，只統計，不做多餘運算或搬移。
import java.util.Arrays;
import java.util.Random;
public class Csoft {
    public static void main(String[] args) {
        // int arr[] = { -535000000, 0, -372, -299,830000};  
		// int arr[] = {100, 93, 97, 92, 96, 99, 92, 89, 93, 97, 90, 94, 0, -1,-1,-95};
        // int arr[] = Random_Numbers(500000000);          
        int arr[] = Random_Numbers(20);          
        System.out.println(Arrays.toString(arr));
        new Csoft(arr);
        System.out.println(Arrays.toString(arr));
    }

    private int min;
    Csoft(){}
    Csoft(int[] arr) {
            b(arr);
    }
    public static int[] b(int[] arr) {
        int max = 0, min = 0;
        for (int i = 0; i < arr.length; i++) {
            max = arr[i] > arr[max] ? i : max;
            min = arr[i] < arr[min] ? i : min;
        }
        int k = -arr[min]; //k=基數
        max = arr[max] + 1;
        int[] a = new int[max + k];
        for (int i = 0; i < arr.length; i++) {
            int o = arr[i] + k;
            a[o]++;
        }
        int t = 0;
        for (int j = 0; j < a.length; j++) {
            if (a[j] > 0) {
                for (int i = 0; i < a[j]; i++) {
                    arr[t] = j - k;
                    t++;
                }
            }
        }
        return arr;
    }    
    public static int[] Random_Numbers(int num) { //亂數負數
        Random r = new Random();
        int[] c = new int[num];
        for (int i = 0; i < num; i++) {
            c[i] = r.nextInt(1000) - 500;
        }
        return c;
    }    
}

C語言的實现

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

void print_arr(const int *arr, const int n) {
	printf("%d", arr[0]);
	for (int i = 1; i < n; i++)
		printf(" %d", arr[i]);
	printf("\n");
}

void counting_sort(const int *ini_arr, int *sorted_arr, const int n, const int max_val) {
	int *count_arr = (int *) calloc(max_val, sizeof(int));
	for (int i = 0; i < n; i++)
		count_arr[ini_arr[i]]++;
	for (int i = 1; i < max_val; i++)
		count_arr[i] += count_arr[i - 1];
	for (int i = n; i > 0; i--)
		sorted_arr[--count_arr[ini_arr[i - 1]]] = ini_arr[i - 1];
	free(count_arr);
}

int main(int argc, char **argv) {
	int n = 10;
	int max_val = 100;
	int *arr = (int *) calloc(n, sizeof(int));
	int *sorted_arr = (int *) calloc(n, sizeof(int));
	srand(time(0));
	for (int i = 0; i < n; i++)
		arr[i] = rand() % max_val;
	printf("ini_array: ");
	print_arr(arr, n);
	counting_sort(arr, sorted_arr, n, max_val);
	printf("sorted_array: ");
	print_arr(sorted_arr, n);
	free(arr);
	free(sorted_arr);
	return 0;
}

javascript实现

Array.prototype.countSort = function() {
  const C = []
  for(let i = 0; i < this.length; i++) {
    const j = this[i]
    C[j] >= 1 ? C[j] ++ : (C[j] = 1)
  }
  const D = []
  for(let j = 0; j < C.length; j++) {
    if(C[j]) {
      while(C[j] > 0) {
        D.push(j)
        C[j]--
      }
    }
  }
  return D
}
const arr = [11, 9, 6, 8, 1, 3, 5, 1, 1, 0, 100]
console.log(arr.countSort())

[1]

Golang的实现

  func countingSort(arr []int, minvalue, maxValue int) []int {
      bucketLen := maxValue - minvalue + 1
      bucket := make([]int, bucketLen)
      for _, v := range arr {
          bucket[v-minvalue]++
      }
      result := make([]int, len(arr))
      index := 0
      for k, v := range bucket {
          kk := k + minvalue
          for j := v; j > 0; j-- {
              result[index] = kk
              index++
          }
      }
      return result

  }

Python3的实现

# -*- coding: utf-8 -*-

def count_sort(a: list) -> list:
    a_min: int = min(a)
    k: int = max(a) - a_min + 1
    c: list = [0 for _ in range(k)]

    for i in a:
        c[i - a_min] += 1

    for i, v in enumerate(c):
        if i == 0:
            continue
        c[i] = v + c[i-1]

    result: list = [None for _ in range(len(a))]
    for i in a:
        result[c[i - a_min] - 1] = i
        c[i - a_min] -= 1

    return result


if __name__ == '__main__':
    print(count_sort([652, 721, 177, 977, 24, 17, 126, 515, 442, 917]))

注解

参考资料

Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. 算法导论，第二版. MIT Press and McGraw-Hill, 2001. ISBN 0-262-03293-7. 第8.2章：计数算法, pp. 168–170.
高德纳。计算机程序设计艺术，第三卷：排序和查找，第二版. Addison-Wesley, 1998. ISBN 0-201-89685-0. Section 5.2, Sorting by counting, pp. 75–80.
Seward, Harold H. Information sorting in the application of electronic digital computers to business operations Masters thesis. MIT 1954.
http://www.nist.gov/dads/HTML/countingsort.html（页面存档备份，存于互联网档案馆） NIST's Dictionary of Algorithms and Data Structures: counting sort]

查论编排序算法
理论	計算複雜性理論大O符号全序关系数据结构术语列表原地算法稳定性比较排序自适应排序（英语：Adaptive sort）排序网络（英语：Sorting network）整数排序（英语：Integer sorting） X+Y排序（英语：X + Y sorting）量子排序（英语：Quantum sort）
交换排序	冒泡排序鸡尾酒排序奇偶排序梳排序侏儒排序快速排序慢速排序臭皮匠排序 Bogo排序
选择排序	选择排序堆排序平滑排序（英语：Smoothsort）笛卡尔树排序（英语：Cartesian tree sort）锦标赛排序（英语：Tournament sort）圈排序（英语：Cycle sort）弱堆排序（英语：Weak heap）
插入排序	插入排序希尔排序伸展排序二叉查找树排序图书馆排序耐心排序
归并排序	归并排序梯级归并排序（英语：Cascade merge sort）振荡归并排序（英语：Oscillating merge sort）多相归并排序（英语：Polyphase merge sort）
分布排序	美国旗帜排序（英语：American flag sort）珠排序桶排序爆炸排序（英语：Burstsort）计数排序比較計數排序插值排序鸽巢排序相邻图排序（英语：Proxmap sort）基数排序闪电排序（英语：Flashsort）
并发排序	双调排序器（英语：Bitonic sorter） Batcher归并网络两两排序网络（英语：Pairwise sorting network）
混合排序	塊排序（英语：Block sort） Tim排序内省排序 Spread排序（英语：Spreadsort）归并插入排序（英语：Merge-insertion sort）
其他	拓撲排序煎餅排序意粉排序（英语：Spaghetti sort）