Week 09


Kruskal's Algorithm1/45

One approach to computing MST for graph G(V,E):

MSTree KruskalMST(Graph g)
{
   MST = empty graph (nV(g))
   sortedEdgeList = sort edges(g) by weight
   foreach (edge e in sortedEdgeList) {
      add e to MST
      if (MST has a cyle) remove e from MST
      if (MST has nV(g) vertices) break
   }
   return MST
}

Critical operations:  sorting edges O(ElogE),  checking for cycles O(EV)


Reminder: MST = subgraph, with all vertices, and minimum sum-of-edge-weights


Prim's Algorithm2/45

Another approach to computing MST for graph G(V,E):

MSTree PrimMST(Graph g)
{
   MST = empty graph (nV(g))
   usedV = {0};  unusedE = edges(g)
   while (size(usedV) < nV(g)) {
      find (edge e in unusedE) satisfying {
         e = (s,t,w), s ∈ usedV, t ∉ usedV,
         w is min weight of all such edges
      }
      add e = (s,t,w) to MST
      usedV += t;  unusedE -= e
   }
   return MST
}

Critical operation:  finding best edge


... Prim's Algorithm3/45

Rough time complexity analysis ...

Notes:


Priority Queues (sidetrack)


Priority Queues5/45

Some applications of queues require

Priority Queues (PQueues or PQs) provide this via: Plus generic ADT operations: new, dispose, isEmpty, ...


Priority order may involve "weight" based on other factors than just key


... Priority Queues6/45

Behaviour of priority queue ...

[Diagram:Pics/linear/pqueue0-small.png]


... Priority Queues7/45

Priority Queue interface:

// Types
... PQueue ... //priority queue
... Item ... //items in queue
... Key ...  //priority values in items
// Operations
PQueue newPQueue(int Size);
void PQueueJoin(PQueue q, Item it);
Item PQueueLeave(PQueue q);
int  PQueueIsEmpty(PQueue q);


... Priority Queues8/45

Priority Queue representations:

Data StructureSpaceJoinLeaveIsEmpty
Sorted ArrayMaxNO(N)O(1)O(1)
Unsorted ArrayMaxNO(1)O(N)O(1)
Sorted ListO(N)O(N)O(1)O(1)
Unsorted ListO(N)O(1)O(N)O(1)
Heap (see later)O(N)O(logN)O(logN)O(1)

for a PQueue containing N items


Shortest Path


Shortest Path10/45

Path = sequence of edges in graph G

With unweighted edges, cost(path) = length (#edges)

With weighted edges, cost(path) = sum of edge weights  (aka weight(path)

Shortest path between vertices s and t

Assumptions: weighted digraph, no negative weights.

Variations: source-target, single-source, all-pairs

Applications:  robot navigation,  routing in data networks


Single-source Shortest Path11/45

Given: weighted digraph G, source vertex s

Result: shortest paths from s to all other vertices

Example:

[Diagram:Pics/wgraphs/sssp-small.png]


Edge Relaxation12/45

Assume: dist[] and pred[] as above
  (but containing data for shortest paths discovered so far)

[Diagram:Pics/wgraphs/relaxation-small.png]

Relaxation updates data for w if we find a shorter path from s to w.


... Edge Relaxation13/45

Relaxation along edge e from v to w

Implementation of edge relaxation:

if (dist[v] + e.weight < dist[w]) {
   dist[w] = dist[v] + e.weight;
   pred[w] = v;
}


Dijkstra's Algorithm14/45

(* the above algorithm is from the book "Data Structures and Algorithms in Java", 6th Edition, By: Michael T. Goodrich; Roberto Tamassia, John Wiley & Sons)


Example-1 (Dijkstra's Algorithm)15/45





Example-2* (Dijkstra's Algorithm)16/45

Find shortest paths (using Dijkstra's Algorithm) for the following graph, start vertex is BWI.

 
Predecessor List
(a dot (.) means no entry exists on the predecessor list)
Action  Cloud  Priority Queue JFK  MIA  BOS   DFW PVD  SFO  LAX  ORD 
 { } [  (0, BWI), (inf, JFK), (inf,ORD), (inf, MIA),  (inf, DFW), (inf, BOS), (inf, PVD),  (inf, SFO), (inf, LAX) ] . . . . . . . .
pull BWI into the cloud  + BWI (0) [  (184, JFK), (621,ORD), (946, MIA),  (inf, DFW), (inf, BOS), (inf, PVD),  (inf, SFO), (inf, LAX) ] BWI BWI . . . . . BWI
pull JFK into the cloud  +  JFK (184) [ (328, PVD), (371, BOS), (621, ORD), (946, MIA), (1575, DFW),   (inf, SFO), (inf, LAX) ] BWI BWI JFK JFK JFK . . BWI
pull PVD + PVD (328) [  (371, BOS), (621, ORD), (946, MIA), (1575, DFW),  (inf, SFO), (inf, LAX) ] BWI BWI JFK JFK JFK . . BWI
pull BOS +  BOS (371) [  (621, ORD), (946, MIA), (1575, DFW),  (3075, SFO), (inf, LAX) ] BWI BWI JFK JFK JFK BOS . BWI
pull ORD, 
update DFW and SFO
+ ORD (621) (946, MIA), (1423, DFW),  (2467, SFO), (inf, LAX) ] BWI BWI JFK ORD JFK ORD . BWI
pull MIA +  MIA (946)   (1423, DFW),  (2467, SFO), (3288, LAX) ] BWI BWI JFK ORD JFK ORD MIA BWI
pull DFW, 
update LAX
+ DFW (1423) [  (2467, SFO), (2658, LAX) ] BWI BWI JFK ORD JFK ORD DFW BWI
pull SFO + SFO (2467) (2658, LAX) ] BWI BWI JFK ORD JFK ORD DFW BWI
pull LAX + LAX (2658) [  ] BWI BWI JFK ORD JFK ORD DFW BWI
 

To recover the path itself, trace back through the Predecessor table. E.g., the shortest path to LAX goes through (in reverse order) LAX, DFW, ORD, BWI.

(* the above example is from the book "Data Structures and Algorithms in Java", 6th Edition, By: Michael T. Goodrich; Roberto Tamassia, John Wiley & Sons)
 


Exercise 1: Tracing Dijkstra's Algorithm17/45

Show how Dijkstra's SSSP algorithm runs on:

[Diagram:Pics/wgraphs/sp-example-small.png]


But maybe better to watch Steven Ha's Visualisations  (http://visualgo.net/sssp.html)


Dijkstra's Algorithm (C implementation)18/45

What's needed for real implementation?

#define MAX_WT  Value larger than any real weight
#define NO_EDGE Value in adj matrix to indicate no edge

// Priority Queue interface
// use dist[] array to determine PQ priorities
PQueue newPQ(float dist[], int nV);
// add vertex to priority queue
void join(PQueue, Vertex);
// remove vertex with smallest dist[]
Vertex leave(PQueue, Vertex);
// reorder queue based on change to vertex
void reorder(PQueue, Vertex);
// check whether queue is empty
int empty(PQueue);
// clean up queue data
void disposePQ(PQueue);


... Dijkstra's Algorithm (C implementation)19/45

C implementation (via Sedgewick)

void shortestPath(Graph g, Vertex start,
                   Vertex pred[], float dist[])
{
   PQueue pq = newPQ(dist,nV(g));
   for (Vertex v = 0; v < nV(g); v++)
      { pred[v] = -1; dist[v] = MAX_WT; join(pq,v); }
   dist[start] = 0.0;  reorder(pq,start);
   while (!empty(pq)) {
      Vertex s = leave(pq);
      for (Vertex t = 0; t < nV(g); t++) {
         float len = g->adj[s][t];
         if (len == NO_EDGE) continue;
         if (dist[s]+len < dist[t]) {
            pred[t] = s;
            dist[t] = dist[s]+len;
            reorder(pq,t);
         }
      }
   }
   disposePQ(pq);
}


... Dijkstra's Algorithm (C implementation)20/45

Rough time complexity analysis ...

Outer loop has O(V) iterations, and PQ updates are O(logV).

Implementing "find (edge e=(s,t,w)) satisfying ..."

  1. try all e ∈ E ... cost = O(VE)
  2. classical Djiskstra approach ... cost = O(V2)
  3. use a PQueue to find edge to relax
In case 3, cost is dependent on efficiency of PQueue.


Searching


Searching22/45

An extremely common application in computing

As previously Keys may be ...
Applications:  Google,  databases, .....


... Searching23/45

Assume that we are dealing largely with primary keys.

Search problem can be encapsulated as:

Item search(Collection c, Key k) { ... }

Possible return values:


For secondary keys ... return an array (possibly empty) of matching items


Item *search(Collection c, key k) { ... }


... Searching24/45

Searching is a very important/frequent operation.

Many approaches have been developed to do it ...

We look at trees and hash tables in this part of course.


Searching in Linear Structures


Searching in Linear Structures26/45

Linear structures: arrays, linked lists, files

Arrays = random access.   Lists, files = sequential access.

Cost of searching:

  Array List    File   
Unsorted O(n)
(linear scan)
O(n)
(linear scan)
O(n)
(linear scan)
Sorted O(logn)
(binary search)
O(n)
(linear scan)
O(logn)
(seek,seek,...)


Although fseek() gives expensive random-access


... Searching in Linear Structures27/45

Search in unsorted array, list, file:

Item searchArray(Key k, Item a[], int n) {
   int i;
   for (i = 0; i < n; i++) {
      if (key(a[i]) == k) return a[i];
   }
   return NOT_FOUND;
}
Item searchList(Key k, List L) {
   List n;
   for (n = L; n != NULL; n = n->next) {
      if (key(n->data) == k) return n->data;
   }
   return NOT_FOUND;
}
Item searchFile(Key k, FILE *f) { // open at start
   Item it;
   while (fread(&it, sizeof(Item), 1, f) == 1) {
      if (key(it) == k) return it;
   }
   return NOT_FOUND;
}


Tree Data Structures


Trees29/45

Trees are branched data structures


[Diagram:Pics/trees/tree-small.png]


... Trees30/45

Trees can be viewed as a set of nested structures:


[Diagram:Pics/trees/subtrees-small.png]


... Trees31/45

For much of this course, we focus on binary trees (k=2)

Binary trees can be defined recursively, as follows:

A binary tree is either


... Trees32/45

Trees are used in many contexts, e.g.


[Diagram:Pics/trees/trees-small.png]


... Trees33/45

Search trees have the properties

Balanced trees have the properties We focus on binary search trees or BSTs


Binary Search Trees (BSTs, BSTrees)


Binary Search Trees35/45

Binary search trees have the characteristic properties

Operations:


... Binary Search Trees36/45

Examples of binary search trees:

[Diagram:Pics/trees/binary-search-trees-small.png]

Shape of tree is determined by order of insertion.


... Binary Search Trees37/45

Level of node = path length from root to node

Depth of tree = max path length from root to leaf

[Diagram:Pics/trees/tree-depth-small.png]

Depth of tree with n nodes:   min = floor(log2n),   max = n-1

Height balanced tree:   nodes, depth(left subtree) ≅ depth(right subtree)

Time complexity of tree algorithms is typically O(depth)


Exercise 2: Insertion into BSTs38/45

For each of the sequences below


(a)   4   2   6   5   1   7   3

(b)   5   3   6   2   4   7   1

(c)   1   2   3   4   5   6   7


Assume ...


Representing BSTs39/45

Typical data structures for trees ...

// Items (and Keys) are simply integer values
typedef int Item;
typedef int Key;

// a Tree is represented by a pointer to its root node
typedef struct node *Tree;

// a Node contains its data, plus left and right subtrees
typedef struct node {
   int data;  Tree left;  Tree right;
} Node;

// some macros that we might occasionally use
#define data(tree)  ((tree)->data)
#define left(tree)  ((tree)->left)
#define right(tree)  ((tree)->right)


... Representing BSTs40/45

Abstract data vs concrete data ...

[Diagram:Pics/trees/concrete-small.png]


Exercise 3: BSTree Operations41/45

The files BSTree.h and BSTree.c contain an ADT for binary search trees.

Using these definitions, implement the following operations:

Look at VisuAlgo first, for some ideas.

Most are best expressed recursively.  Base case?  Recursive case?

Note:   BSTreeFind() returns yes/no;   BSTreeInsert() ignores duplicates


Tree Traversal42/45

Iteration (traversal) on ...

For binary Trees, several well-defined visiting orders exist:


... Tree Traversal43/45

Consider visiting an expression tree like:

[Diagram:Pics/trees/tree1-small.png]

NLR:  + * 1 3 - * 5 7 9    (prefix-order: useful for building tree)
LNR:  1 * 3 + 5 * 7 - 9    (infix-order: "natural" order)
LRN:  1 3 * 5 7 * 9 - +    (postfix-order: useful for evaluation)
Levl:  + * - 1 3 * 9 5 7    (level-order: useful for ??)


... Tree Traversal44/45

Traversal (with parameterised visit operation):

void Traverse(Tree t, void (*visit)(Item))
{
   if (t != NULL) {
     // put "visit data" here for NLR
     Traverse(t->left, visit);
     // put "visit data" here for LNR
     Traverse(t->right, visit);
     // put "visit data" here for LRN
   }
}

Level-order cannot be implemented recursively.


Exercise 4: Generic Traversal45/45

Implement a generic tree traversal function

void BSTreeTraverse(BSTree t,
                    void (*visit)(),
                    char *order)

where


Produced: 13 Sep 2017