Bubble sort has O(n2) execution time. There are n(n-1)/2 comparisons in every case. In the best case, the array is already sorted and there are no swaps. In the worst case, the array is reverse sorted and there is a swap for every compare. In the average case, there is 1/2 swap per compare, or n(n-1)/4 swaps.
The sort statistics always show n(n-1)/2 compares. Each swap counts as 2 moves, so these numbers are about the same on average.
The Show major steps checkbox shows each pass. The show swaps checkbox shows each comparison that results in a swap. Note that array bounds start at 0 in all examples.
For all sorting algorithms, compares refers to comparisons between array elements and not to other variables. Moves refers to assignments to the array and not to any other assignments. Execution time is wall clock time, and does not include the time to read in the array or write it out. However, it does include waiting for the user while single stepping.
There are many ways to partition the array. The technique used here is from Jon L. Bentley (Programming Pearls, Communications of the ACM, Apr. 1984, p. 287). We use the first element as the pivot, then scan the array, keeping a count of elements smaller than the pivot. Each time we find one, we swap it with the first large element from the left. Finally, we swap the last small element with the pivot, putting it in the middle.
Quicksort is O(n log n) in the average case and O(n2) in the worst case. The worst case is when we choose the pivot poorly, so that it's the largest or smallest element in the array. This happens when the input is already sorted or reverse sorted. Try doing two quicksorts, and notice how the second one is slower.
Stepping through the sort shows the pivot and each pair of recursive calls after ordering the data around the pivot. If one of the array segment is size 1 (such as A[0..0]) or size 0 (A[0..-1]), the call returns immediately.
Merge sort is O(n log n) in all cases. There are no swaps, so checking the box has no effect. Only assignments into the original array (and not the temporary) are counted as moves. Stepping through the sort shows each recursive call.
In this program, the numbers are radix sorted in base 2. The first pass copies each even number into one array and each odd number into another. Then the two arrays are copied (even numbers first) back to the original. The next pass works on bit 1, then bit 2, and so on. The program detects when no more bits need to be sorted on and stops. The program uses 32 bit signed integers, so it only works on numbers in the range 0 to 231 - 1.
Radix sort is O(nd). There are nd assignments back to the array and no comparisons between elements (and no swaps). Note how larger numbers take longer to sort, unlike the other algorithms.
xi = xi-24 XOR xi-55which has a cycle length of 255 - 1. The first 55 values are seeded with a pseudo-random sequence:
xi = floor(1012 sin(i)) mod 231The program selects a random number in the range 0..n-1 by
xi mod nwhere n is the number selected in the random number range box.
The Reset Seq button resets the sequence back to the starting point of x56.
Execution times were obtained using Netscape 2.0 Javascript running under Windows 3.1 on a U.S. Logic DX4/100 (100 Mhz 80486) with 8 MB RAM. All tests were performed with the same sequence of random numbers from 0 to 999.
Compares (between elements of A) | |||||||
---|---|---|---|---|---|---|---|
n | 10 | 20 | 30 | 40 | 50 | 100 | 200 |
Bubble Sort | 45 | 190 | 435 | 780 | 1225 | 4950 | * |
Quick Sort | 29 | 74 | 114 | 212 | 250 | 600 | 1645 |
Merge Sort | 21 | 62 | 104 | 162 | 219 | 543 | 1278 |
Radix Sort | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Moves (assignments to A) | |||||||
n | 10 | 20 | 30 | 40 | 50 | 100 | 200 |
Bubble Sort | 28 | 212 | 406 | 878 | 1202 | 5130 | * |
Quick Sort | 10 | 52 | 86 | 178 | 254 | 478 | 1474 |
Merge Sort | 34 | 88 | 148 | 216 | 286 | 672 | 1544 |
Radix Sort | 100 | 200 | 300 | 400 | 500 | 1000 | 2000 |
Execution time (seconds) | |||||||
n | 10 | 20 | 30 | 40 | 50 | 100 | 200 |
Bubble Sort | 0.33 | 1.70 | 3.57 | 7.04 | 10.22 | 42.9 | * |
Quick Sort | 0.22 | 0.55 | 0.88 | 1.65 | 2.14 | 4.50 | 13.13 |
Merge Sort | 0.33 | 0.88 | 1.37 | 1.97 | 2.74 | 6.37 | 14.78 |
Radix Sort | 0.38 | 0.71 | 0.99 | 1.38 | 1.70 | 3.29 | 6.64 |
The next table shows the effects of sorting an already sorted array. Note how the bubble sort is twice as fast, since it does no swaps. Quicksort is slowed considerably, since the pivot is now chosen poorly, so that each n-element slice is recursively subdivided into arrays of length 0 and n - 1.
Random input | Sorted input | |||||
---|---|---|---|---|---|---|
n = 100 | Compares | Moves | Time | Compares | Moves | Time |
Bubble Sort | 4950 | 5130 | 43.23 | 4950 | 0 | 22.30 |
Quick Sort | 600 | 478 | 4.51 | 4950 | 0 | 19.88 |
Merge Sort | 543 | 672 | 6.43 | 316 | 672 | 5.54 |
Radix Sort | 0 | 1000 | 3.29 | 0 | 1000 | 3.35 |
The last table shows how the radix sort slows down by a factor of 2 when the range of input numbers is increased from 1000 to 1,000,000.
Random, range 0-999 | Random, range 0-999999 | |||||
---|---|---|---|---|---|---|
n = 100 | Compares | Moves | Time | Compares | Moves | Time |
Bubble Sort | 4950 | 5130 | 43.23 | 4950 | 4810 | 42.35 |
Quick Sort | 600 | 478 | 4.51 | 846 | 562 | 5.93 |
Merge Sort | 543 | 672 | 6.43 | 538 | 672 | 6.38 |
Radix Sort | 0 | 1000 | 3.29 | 0 | 2000 | 6.92 |
Timing was done by hand with a stopwatch (with an attempt to compensate for reflex delays), so should only be considered accurate to about 0.3-0.5 seconds. The effect of disk-I/O was estimated at 0.1 seconds by repeating the sort commands and timing the difference. The machine caches disk to memory using SMARTDRV 4.0 (read caching only).
The random numbers were generated using the same algorithm as in the Javascript examples. (See source code and sample output). Unfortunately, the sequence differs after the 8'th number, probably due to rounding differences in the sin() function.
n | Range | Time (random) | Time (presorted) |
---|---|---|---|
1000 | 0-999 | 1.1 | 0.7 |
2000 | 0-999 | 3.6 | 1.8 |
3000 | 0-999 | 8.2 | 4.3 |
4000 | 0-999 | 15.2 | 7.6 |
5000 | 0-999 | 24.0 | 12.0 |
10000 | 0-999 | 98.7 | 47.1 |
5000 | 0-0 | 10.3 | |
5000 | 0-1 | 15.9 | |
5000 | 0-9 | 17.2 | |
5000 | 0-10 | 23.0 | |
5000 | 0-99 | 20.1 | 11.2 | 5000 | 0-100 | 27.7 | 15.6 |
5000 | 0-999 | 24.0 | |
5000 | 0-1000 | 30.2 | |
5000 | 0-9999 | 26.2 | |
5000 | 0-10000 | 34.2 | |
5000 | 0-99999 | 30.5 |
The results clearly show an O(n2) sort. If the input is already sorted, the sort is about twice as fast, indicating that the sort is O(n2) in both comparisons and moves. The bubble sort shows this characteristic, but so do the insertion sort and selection sort, so we cannot draw any conclusions about the type of sort.
It is possible to write a selection sort that is O(n) in moves, either by making the sort unstable or by using an auxiliary array. The SORT command must be stable because it is capable of sorting on any column. Furthermore, the SORT command is limited to input files smaller than 64K bytes.
The size of the input file is n(d+2), where d is the number of digits needed to represent the largest random number. There are 2 characters at the end of each line (CR and LF). Increasing the number of digits by 1 increases the sort time for n=5000 by 3-4 seconds. Assuming n2/2 moves, each byte takes 0.25-0.3 microseconds to move.
Increasing the range without increasing the number of digits decreases the sorting time (by 3-4 seconds for n=5000) because on average, one less character needs to be compared on each line. With a range of 0-100, 0-1000, etc., most lines start with a blank, so at least 2 characters need to be compared on each line. If the range is 0-99, 0-999, etc., 90% of string comparisons can stop after one digit. Thus, each byte comparison also takes 0.25-0.3 microseconds.
Finally, it should be noted that inspite of the SORT command's limitations, it is 4000 times faster than the Javascript bubble sort. Furthermore, SORT has 100 times the capacity, using 64K of RAM, than the Javascript example running with 8 MB of RAM and an additional 6 MB of virtual memory.
Matt Mahoney, matmahoney@aol.com,