HEAP SORT ALGORITHM

Aditya
5 min readMar 23, 2021

First of let us start with what is a heap.

We visualize the elements of an array in the form of a complete binary tree called a heap. In simple terms, a heap is formed by building a complete binary tree of the items in a heap.

A max / min heap is a complete binary tree in which the value of the parent node is greater (max heap) / smaller (min heap) than its two child nodes. Normally we create a max heap for the heap sorting algorithm. We use the heapify method.

Heapify can be described as maintaining ownership of max heap i.e. Compare the elements of the parent node to the child nodes and the max element would take the place of the parent node. The same applies to subtrees.

Now talking about heap sort. It is a comparison-based sorting algorithm based on the data structure of the heap. It works in a similar way to selection sorting, in which the maximum element is inserted in its ordered position after swapping with the minimum value.

Heapsort was invented by J. Williams in 1964. The birth of the heap already envisioned by Williams as an independent useful data structure.

That same year, R. Floyd released an improved version that could sort an array and continued his earlier research on the tree sorting algorithm.

Sort by selection, Heapsort doesn’t waste time with a linear comparison of the unordered area. Instead, Heapsort makes sure the unordered area is placed in a binary tree data structure to find the largest item faster with each step.

Heapsort is an in-place algorithm but it is not a stable sort.

HOW IT WORKS?

  • Create the heap and make it the maximum heap(Heapify procedure).​
  • Swap the root node from the last node of the heap.
  • Remove the swapped value from the last node of the heap.

Let us see an example :

Array elements: 10, 12, 20, 15, 4
10
/ \
12 20
/ \
15 4

Applying heapify procedure to create max heap:
// 10 has two children 12 and 20. 20 took the place of parent as it is the greatest. 12 is then compared with 15 and 4. 15 took the place of 12 as it is the greatest. 20
/ \
15 10
/ \
12 4
Array elements: 20, 15, 10, 12, 4Now it is transformed into a max heap. We swap the root node with the last node (the greatest and the smallest element) and delete the last node
4
/ \
15 10
/
12
Array elements: 4, 15, 10, 12 |20The heapify procedure calls itself recursively to build heap
in top down manner.
So the same procedure goes on (heapifying the heap and swapping the greatest and smallest element and deleting the last node)The size of the tree is reduced by 1 everytime we do this procedure and finally the elements are sorted.

Working animation works wonders to understand the idea better.

Heap sort example

Psuedo-Code:

Heapsort(A) 
{
BuildHeap(A)
for i = length(A) decrement to 2
{
swap(A[1] , A[i])
heapsize = heapsize -1
Heapify(A, 1)
}
BuildHeap(A)
{
heapsize = length(A)
for i = length/2 decrement to 1
Heapify(A, i)
}

Heapify(A, i)
{
l = left(i)
r = right(i)
if (l <= heapsize) and (A[l] > A[i])
largest = l
else
largest = i
if (r <= heapsize) and (A[r] > A[largest])
largest = r
if (largest != i) {
swap(A[i],A[largest])
Heapify(A, largest)
}
}

Time Complexity and Space Complexity

The heapify process takes the time complexity of O(logn)and it is done n times so time complexity for n heapify procedures is O(nlogn)and the building the heap takes O(n) time complexity.

So the total time complexity is O(n) + O(nlogn) = O(nlogn). Heap sort has O(1) space complexity as it is an in-place algorithm i.e. it does not require any extra space. Elements are rearranged during each recursive call inside the same array.

sorting animation for heapsort

COMPARISON WITH OTHER SORTING TECHINIQUES

Heap sort primarily competes with Quick sort.

Heapsort’s primary advantages are:

  1. It is Simple
  2. Non-recursive code
  3. Minimal auxiliary storage requirement
  4. Dependably great performance

While it can’t show improvement over O(nlogn) for pre-arranged information sources, it doesn’t experience the ill effects of quicksort’s O(n²) assuming the worst possible scenario. Heap sort has poor locality/area of reference.

A very much carried out quicksort is typically quicker than heapsort. In spite of the fact that quicksort requires less examinations.

The primary benefit of quicksort is its vastly improved locality of reference; partitioning results in great spatial region, and the recursive subdivision has great temporal locality. In this manner, quicksort is favored when the extra execution legitimizes the execution exertion.

Heap sort once in a while rivals Merge sort as it is not an in-place sorting algorithm ,but, Merge sort has the benefit when stable arranging is required and if there are pre-sorted inputs.

Heap sort has plainly good locality/area of reference.

BOTTOM-UP HEAP SORT

The bottum-up heapsort is a variant that reduces the number of comparisons required for a significant factor. However, if the comparisons require a function call or other complex logic, the bottom-up heapsort is advantageous.

This is achieved by improving the heapify method.

The change improves the creation phase of the linear time heap building phase, but is more significant in the second phase.

As with normal heapsort, each iteration of the second phase extracts the largest value at the root node of the heap and fills the remaining space. Then, using the smallest node, sift that last item on the heap.

However, this element is from the lowest level of the heap, which means that it is one of the smallest elements in the heap, so sifting will likely take many steps to move around. In normal heapsort, each step of sifting requires two comparisons to find the minimum of three elements: the root node and its two children.

Level of the tree with only one comparison per level, i.e. it finds a leaf that has the property that it and all of its ancestors are greater than or equal to its siblings, and then looks up from that leaf (using a comparison per level ) for the correct position in this path to insert the value of the last node.

This is the same location as normal Heapsort searches and requires the same number of swaps to insert, but fewer comparisons are required to find this location.

Bottom-up heapsort was successfull in beating quicksort with median-of-three pivot selection on arrays of size ≥16000.

OTHER ADVANTAGES

  1. Heap sort is mostly used in embedded systems such as Linux Kernel because of its upper bound on time complexity O(nlogn) and less auxiliary storage of O(1).
  2. Can be effectively used for extracting smallest or largest elements without sorted manner of arrangement for example priority queues.

--

--