Growing with the Web

Binary heap: Build heap proof


The binary heap data structure supports a build heap operation that runs in O(n). Intuitively it might seem that it should run in O(n \log n) time since it performs an O(\log n) operation n/2 times. This article provides a proof of the linear run time.

This article assumes knowledge of the binary heap data structure.

Figure: binary heap build heap operation example.
The build heap operation

The basic reason the time is linear is that the cost of heapify depends on where the node sits within the heap. It takes O(1) time when the node is a leaf (leaves make up at least half of the nodes) and O(\log n) time when the node is the root.
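To make the operation being analyzed concrete, here is a minimal sketch of build heap on a max-heap, assuming the usual array layout where the children of index i sit at 2i + 1 and 2i + 2 (the function names are illustrative, not from the article):

```python
def heapify(a, i, n):
    """Sift a[i] down until the subtree rooted at i satisfies the max-heap property."""
    while True:
        left, right, largest = 2 * i + 1, 2 * i + 2, i
        if left < n and a[left] > a[largest]:
            largest = left
        if right < n and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest  # continue sifting down from the child we swapped with

def build_heap(a):
    """Heapify every non-leaf node, from the last internal node up to the root."""
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):
        heapify(a, i, n)
    return a
```

Note that the loop skips indices n // 2 and above entirely, since those nodes are leaves and already trivially heaps; this is where the "at least half the nodes cost O(1)" observation shows up in code.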

The O(n) time can be proven by solving the following:

\sum_{h=0}^{\lfloor\log n\rfloor} \lceil\dfrac{n}{2^{h + 1}} \rceil O(h)

This can be broken down into three parts:

  • O(h) is the worst-case time heapify takes for a node at height h
  • \lceil\dfrac{n}{2^{h + 1}} \rceil is the number of nodes at any given height h
  • \sum_{h=0}^{\lfloor\log n\rfloor} is a summation that sums the values for each height of the tree from 0 to \lfloor \log n \rfloor.
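The node-count term can be checked empirically. The sketch below (my own illustrative helpers, not from the article) computes the height of every node in a complete binary tree stored as an array and verifies that at most \lceil n / 2^{h+1} \rceil nodes have height h:

```python
import math

def height(i, n):
    """Height of node i (edges down to its deepest descendant) in a
    complete binary tree of n nodes stored as an array."""
    kids = [c for c in (2 * i + 1, 2 * i + 2) if c < n]
    return 1 + max(height(c, n) for c in kids) if kids else 0

def check_height_bound(n):
    """Verify that at most ceil(n / 2**(h+1)) nodes have height h."""
    counts = {}
    for i in range(n):
        h = height(i, n)
        counts[h] = counts.get(h, 0) + 1
    return all(counts[h] <= math.ceil(n / 2 ** (h + 1)) for h in counts)
```

For h = 0 the bound gives \lceil n/2 \rceil, matching the earlier observation that at least half of the nodes are leaves.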

n does not depend on the summation index h, so it can be factored out of the summation; the h from O(h) moves into the fraction, and the ceiling can be dropped since it only changes the result by a constant factor.

= O(n \cdot \sum_{h=0}^{\lfloor\log n\rfloor} \dfrac{h}{2^{h + 1}})

\dfrac{h}{2^{h + 1}} can be rewritten as \dfrac{h}{2^h \cdot 2} and then the 2 can be dropped completely since constants are discarded within big-O notation.

= O(n \cdot \sum_{h=0}^{\lfloor\log n\rfloor} \dfrac{h}{2^h})

Since every term is non-negative, the finite sum is bounded above by the corresponding infinite series, so we can extend the upper limit to infinity.

= O(n \cdot \sum_{h=0}^{\infty} \dfrac{h}{2^h})

The infinite series converges to 2 (it is the standard series \sum_{h=0}^{\infty} h x^h = \dfrac{x}{(1-x)^2} evaluated at x = \frac{1}{2}), so it can be swapped in and then dropped, leaving O(n).

= O(n \cdot 2) = O(n)
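As a quick numerical sanity check on the convergence claim, the partial sums of h/2^h do approach 2 (the helper name below is illustrative):

```python
def partial_sum(terms):
    """Partial sum of h / 2**h for h = 0 .. terms - 1."""
    return sum(h / 2 ** h for h in range(terms))

# The partial sums climb toward 2: 0.0, 0.5, 1.0, ... and by 60 terms
# the remaining tail is far below floating-point noise.
```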
