Binary heap

Binary heap use an implicit relationship specified by the indices

1.Introduction

  • A heap is just a tree with a special property or constraints on the values of its nodes

  • The tree with special properties or constraints on the value of its nodes these constraint called the heap property the constraint

  • Heap property

    • 1.Minimum heap

      • Every node value should be <= value of it's children

      • The node with the smallest value should be the root of the tree

    • 2.Maximum heap

      • Every node value should be >= value of it's children

      • The node with the largest value should be the root of the tree

  • Shape property

    • If H is the height of the tree - The leaf nodes should only be at level H or H - 1

      • Only the last level is incomplete in the heap

      • All nodes at level H - 1 have to be filled before moving on to level H

    • The heap should form a complete binary tree - all levels except the last should be filled

2. Logic

  • The operations typically performed on a heap requires us to

    • 1.Traverse downwards from the root towards.

    • 2.The leaf node Traverse upwards from the leaf nodes towards the root

  • On the heap we want to be able to

    • 1.Get left child

    • 2.Get right child

    • 3.Get parent

  • Every level of the binary tree in a heap is filled except perhaps the last

  • Implement by array:

    • Node at index i:

      • has a left child at index: 2i + 1

      • has a right child at index: 2i + 2

      • has parent at index: (i - 1)/2

3.Implementation

  • Base class

    • A generic heap can hold data of any type

    • The generic type has to extend comparable - This how we check the highest priority

4.Heapify

  • While inserting or removing an element into the heap how do we know which is the right position for the element to occupy?

Sift down

  • An element is in the wrong position with respect to other elements bellow it in the heap

  • It has to be moved downwards in the heap towards the leaf nodes to find it's right position

Sift up

  • An element is in the wrong position with respect to other elements above it in the heap

  • It has to be moved upwards in the heap towards the leaf nodes to find it's right position

5.Insert

  • 插在Array的最後面, 然後Sift up上來

6.Remove

  1. 移除最小值 (array[0])

  2. 將最大值移到root

  3. Sift down root

7.Complexity

  • Insert: O(log n)

  • Access: O(1)

  • Remove: O(log n)

8.Find the maximum element in a minimum heap

  • 先找出最後一個node

  • 比較所有leaf node的大小 (最大值一定是其中一個leaf)

9.Find the K largest elements in a stream

  • Use a minimum heap with size K to store elements as they come in

10.Merge K sorted arrays using heaps

  • Remove the smallest element from each list and add it to the minimum heap - there will be K elements in the heap

11.Find the median elements in a stream

Tips

  • The stream implies that we have no idea which elements are going in to come in text - we have to keep track of the median as the elements stream in

  • Use a max heap to store the smaller (first) half of elements seen in the stream

  • Use a min heap to store the larger (second) half of elements seen in the stream

  • 原則上, min heap及max heap的root都是stream的median elements

  • min heap的node值都小於平均值, max heap的node值都大於平均值

Implementation

  1. 第一個數字放到max heap

  2. 接下來的數字依序跟max heap比較, 如果比max heap的root大則放到min heap, 如果比max heap的root小則放到max heap

  3. 如果max heap及min heap個數差異超過1, 則必須rebalance: 從min heap中取出最小值並且加入max heap

  4. 如果兩個heap的數量相等, 則平均值為兩個heap root的平均值

  5. 否則平均值為數量多的heap的root值

Last updated