Heap Sort

From CS315

Jump to: navigation, search

Contents

The Facts

  • Heap Sort is O(n log n) in time like other fast sorts.
  • It has no best or worst case.
  • The most attractive feature of Heap Sort is that it uses constant space.
  • Heap Sort is not stable.

The Machinery

Representing a Binary Heap

Heap Sort employs a binary heap. That is, a binary tree which has the quality that each node bears some relationship to its parent node. In a 'max' binary heap the root node is the node containing the largest value, and each child is less than its parent. Similarly, in a 'min' binary heap the root node is the smallest and each child is greater than its parent. In this context we will refer only to 'max' binary heaps for the purpose of examples. A binary heap may be translated from tree to array form using the following equations:

p = (c - 1)/2
c1 = 2p + 1
c2 = c1 + 1

where p, c1 and c2 represent the indices of the parent, left child and right child, respectively.

Making a Binary Heap

There are two efficient algorithms that may be utilized to create a binary heap from an unsorted array. The first algorithm (here called upHeap) assumes that the first n-1 items of the array are heaped (in heap form) and makes a heap of size n by inserting the nth item appropriately. The second algorithm (here called downHeap) assumes that the last n-1 items are heaped and makes a heap of size n by performing a series of successive swaps of the out of place item with the larger of its two children.A heap may be formed from 'unheaped' data with successive applications of either of these two algorithms.

Most implementations of makeHeap (the algorithm that heaps unordered data) utilize downHeap because it is only necessary to call downHeap n/2 times (as opposed to making n calls to upHeap) in order to get a heaped array. This is a result of the fact that 1/2 of the items in a heap invariably have 0 children, and therefore nothing useful can be inferred about those items' relationship to each other. Because downHeap is applied on subarrays starting at the end of the array to be heaped (upHeap starts at the beginning) and because modifying the order of the last n/2 items with respect to each other is essentially a wasted effort, downHeap need only be applied n/2 times to result in a heap of size n.

Pushing and Popping a Heap

Commonly, it is desirable to add items to or retrieve items from a binary heap. These functions are traditionally known as pushing and popping the heap respectively. Pushing the heap takes as a precondition an array whose first n-1 items are heaped and a last item about which nothing is known and returns a heap, thus successfully adding an item to the heap. Popping the heap takes as a precondition an array of heaped data of size n and returns an array containing a heap of size n-1 followed by the max item in the array. This function is accomplished by swapping the first (and thus max) item with the last item about which nothing is known, and then making one call to downHeap (on the first n-1 items in the array) which will restore the heap quality of the first n-1 items.

Heap Sort

Finally, we have the background to talk about how heap sort itself works. Heap sort begins by calling makeHeap on the array of unsorted data. This yields an array with the property that the first item is the max item. Now, a series of calls to popHeap on successively shorter subarrays will yield a sorted array. First, popHeap is called on the entire array of n items. This yields an array whose first n-1 items are heaped and whose last nth item is the max item. Next, popHeap is called on the subarray consisting of the first n-1 items, which puts the second largest item in the next to last position of the array. A third call to popHeap on the first n-2 items in the array will leave the third largest item in the third to last position. And so on.


Written by Amelia Harrison

Personal tools