David M. Howard dmh2000@cwix.com
Sun Microsystems Certified Java Developer
Sorting Algorithms
This applet lets you observe the dynamic operation of several basic sort algorithms. The implementations,
explanations
and display technique are all taken from Algorithms in C++, third Edition, Sedgewick, 1998.
- Generate a data set by clicking one of the data set buttons.
- This generates a data set with the appropriate characteristics. Each data set is an array of 512 values in
the range 0..511. Data sets have no duplicate entries except the Gaussian distribution.
- Run one or more sort algorithms on the data set to see how the algorithm works.
- Each execution of a sort runs on the same data set until you generate a new one.
|
Data Set Types
- Click "Random" to get an evenly distributed random data set with no duplications.
- Click "Ordered" to get a data set in ascending order with no duplications
- Click "Reverse" to get a data set in descending order with no duplications
- Click "Gaussian" to get a gaussian distribution with duplicate elements
Sort Algorithms
- Click "Bubble" to run the bubble sort algorithm.
- Click "Shell" to run the shellsort algorithm.
- Click "QSort" to run the quicksort algorithm.
- Click "Merge" to run the mergesort algorithm.
- Click "Heap" to run the heapsort algorithm.
|
- On the graph, the x axis is the domain of possible values 'i' and the y axis is the value of the array at index
'i'. When a complete data set is sorted all values will be on the diagonal, i == a[i]. The more spread out the
points are the farther from being sorted the array is. For example, the value i=1 may start out in a[511] and it
will end up in a[1] when the array is fully sorted. The Gaussian data set has duplicates so it ends up as a curve
instead of a line.
- The display shows a number of 'Ops' which is an approximation of the total number of comparisons and item swaps
that occurred during the sort. This value corresponds roughly to the complexity of the algorithm expressed in big-O
notation. It doesn't reflect the implementation details that also contribute to algorithm performance. Different
data sets will yield different performance for a given algorithm.
- An important characteristic of sort algorithms that has a large effect on their efficiency is how far elements
are moved in a swap. Bubble sort moves elements only a short distance and the array crawls into sorted order. Better
algorithms move elements farther in a single operation so that the array seems to jump into sorted order. The display
technique used here really shows this effect.
- This applet redraws the array at various key points in the algorithm with a time delay inserted to make it
run slow enough to watch. Because the loop/recursion structure of the various algorithms can be different, where
the repaints occur and because the repaint and time delay dominate the running of the applet, the clock time to
completion of a particular algorithm doesn't reflect the relative performance of the algorithm. Use the 'Ops' figure
for comparisons.
- For a comprehensive overview of sorting algorithms and algorithm complexity see the Sedgewick book. The implementations
used here are the basic ones from his book. There are plenty of optimizations that can be done to improve these
algorithms.
Basic Algorithm Characteristics
The analysis of sort algorithms is complex and has a lot of variables, but in general the algorithms used here
should come out something like this:
- Bubble sort is O(N**2). The number of Ops should come out <= 512 * 512 = 262144
- Quicksort is O(2N lg N) on the average but can degenerate to (N**2)/2 in the worst case (try the ordered data
set on quicksort). Quicksort is recursive and needs a lot of stack space.
- Shell sort (named for Mr. Shell) is less than O(N**(4/3)) for this implementation. Shell sort is iterative
and doesn't require much extra memory.
- Merge sort is O( N lg N) for all data sets, so while it is slower than the best case for quicksort, it doesn't
have degenerate cases. It needs additional storage equal to the size of the input array and it is recursive so
it needs stack space.
- Heap sort is guaranteed to be O(N lg N), doesn't degenerate like quicksort and doesn't use extra memory like
mergesort, but its implementation has more operations so on average its not as good as quicksort.
Source Code
David M. Howard, October 1998
Home