Tutorial Solutions Week 07

Exercise 1

  1. Explain what stability means in the context of sorting

    A stable sorting algorithm preserves the relative order of items that compare as equal (duplicates). An unstable sorting algorithm does not always preserve that order. For example, suppose a case-insensitive sorting algorithm was given the following input:

    zzzz
    AAAA
    csdf
    aaaa
    
    and the output was
    aaaa
    AAAA
    csdf
    zzzz
    
    The duplicates "AAAA" and "aaaa" have swapped their relative order in the output. This proves that the algorithm is unstable, because a single counterexample in which the order of duplicates is not preserved is enough.

  2. Suppose you have an implementation of a sorting algorithm that sorts strings and is case insensitive (for example 'a' and 'A' are considered to be equal). Explain what is wrong with the following argument:

    I ran the following input through the program

    AAAAA
    zzzzz
    abcde
    aaaaa
    
    and the output of the program was
    AAAAA
    aaaaa
    abcde
    zzzzz
    
    This means my sorting program is stable

The fact that the algorithm preserves the order of duplicates for this particular data set does not mean it preserves it for every data set. The algorithm is stable only if the order of duplicates is preserved for all possible inputs, and one passing test cannot establish that.
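
To make the definition concrete, here is a minimal sketch of a sort that is stable by construction: an insertion sort using a case-insensitive comparison. The names compareNoCase and stableSort are made up for this illustration and are not from the lectures. An item only moves left past items that are strictly greater, so "AAAA" and "aaaa" can never overtake each other.

```c
#include <ctype.h>

/* Hypothetical helper: case-insensitive comparison.
   Returns negative, zero or positive, in the style of strcmp. */
int compareNoCase(const char *a, const char *b) {
    while (*a != '\0' && *b != '\0') {
        int ca = tolower((unsigned char)*a);
        int cb = tolower((unsigned char)*b);
        if (ca != cb) {
            return ca - cb;
        }
        a++;
        b++;
    }
    return tolower((unsigned char)*a) - tolower((unsigned char)*b);
}

/* Insertion sort on an array of strings.
   The test is "> 0", not ">= 0", so an item never moves past
   an item that compares equal to it. That is what makes it stable. */
void stableSort(const char *items[], int n) {
    int i, j;
    for (i = 1; i < n; i++) {
        const char *key = items[i];
        for (j = i; j > 0 && compareNoCase(items[j - 1], key) > 0; j--) {
            items[j] = items[j - 1];
        }
        items[j] = key;
    }
}
```

Calling stableSort on the input zzzz, AAAA, csdf, aaaa from part 1 leaves "AAAA" ahead of "aaaa". Note that this is an argument from how the code works for every input, which is the kind of argument part 2 asks for, rather than an observation about one test run.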

Exercise 2

Naive Bubble Sort

1 void bubbleSort(int items[], int n) {
2    int i, j;
3        
4    for(i = n - 1; i > 0 ; i--) {
5        for(j = 1; j <= i; j++) {
6            if (items[j] <  items[j - 1]) {
7                swap(j,j-1,items);
8            }
9        }
10    }
11}
Random order: 3,2,4,8,1
Sorted order: 1,2,3,4,5
Reverse order: 5,4,3,2,1
  1. Show how each of the arrays above changes as it is sorted by the program above.
    Each array is shown at the start and then after each swap.
    Random Order
    3 2 4 8 1
    2 3 4 8 1 
    2 3 4 1 8 
    2 3 1 4 8 
    2 1 3 4 8 
    1 2 3 4 8 
    
    Sorted Order
    1 2 3 4 5
    
    Reverse Ordered
    5 4 3 2 1
    4 5 3 2 1 
    4 3 5 2 1 
    4 3 2 5 1 
    4 3 2 1 5         
    3 4 2 1 5 
    3 2 4 1 5 
    3 2 1 4 5
    2 3 1 4 5 
    2 1 3 4 5 
    1 2 3 4 5 
    
  2. How many swaps are performed on the random, sorted and reverse ordered data sets shown above?
    random : 5
    sorted : 0
    reverse : 10
  3. How many comparisons are performed on the random, sorted and reverse ordered data sets shown above? By comparison we mean comparing two data elements from the array; we are not including the loop counter comparisons.
    10 comparisons (4 + 3 + 2 + 1) for data of size 5, no matter what order the data is in.
  4. With each line of code associate a cost and a formula expressing the number of times the C statement on that line will be executed when sorting n items in the worst case.

    Solution

    Line   Cost   Times
    4      c0     n
    5      c1     n + (n-1) + ... + 2 = (n (n+1)) /2 - 1
    6      c2     (n-1) + (n-2) + ... + 1 = (n (n-1)) /2
    7      c3     (n-1) + (n-2) + ... + 1 = (n (n-1)) /2

    Lines 6 and 7 have the same count because in the worst case the comparison is always true, so we always swap.
  5. What is the asymptotic worst case time complexity of the algorithm implemented by this program?
    O(n^2)
  6. What is the time complexity of the algorithm in the best case?
    We never have to swap (line 7 is never executed), but every comparison is still made, so it is still O(n^2).
  7. Modify the program to implement bubble sort with early exit. What is the asymptotic worst case time complexity now? What is the time complexity now in the best case?
    void bubbleSortEE(int items[], int n) {
        int i, j;
        int done = 0; //added this line
        for(i = n - 1; i > 0 && !done ; i--) { //updated this line
            done = 1;           //added this line
            for(j = 1; j <= i; j++) {
                if (items[j] <  items[j - 1]) {
                    swap(j,j-1,items);
                    done  = 0;  //added this line
                }
            }
        }
    }
    
    Now in the best case, when the data is already in sorted order, the inner loop only needs to go through the array once. No swaps are done, so done stays 1 and the outer loop finishes. This makes it O(n) in the best case. In the worst case, and therefore overall, it is still an O(n^2) algorithm.
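
The counts in parts 2, 3 and 7 can be checked by instrumenting the sort. The following is a sketch only: the Counts struct, the swapItems helper and the earlyExit flag are made up for this illustration and are not part of the lecture code. With earlyExit set to 0 it behaves like the naive version, and with earlyExit set to 1 it behaves like bubbleSortEE.

```c
/* Hypothetical counters, used for illustration only. */
typedef struct {
    int comparisons; /* comparisons between two array elements */
    int swaps;       /* calls to the swap helper */
} Counts;

static void swapItems(int index1, int index2, int items[]) {
    int tmp = items[index1];
    items[index1] = items[index2];
    items[index2] = tmp;
}

/* Bubble sort that reports how much work it did.
   earlyExit == 0: naive version, every pass is always run.
   earlyExit == 1: stop after the first pass that makes no swap. */
Counts bubbleSortCount(int items[], int n, int earlyExit) {
    Counts c = {0, 0};
    int i, j;
    int done = 0;
    for (i = n - 1; i > 0 && !done; i--) {
        done = earlyExit; /* stays 0 in the naive version */
        for (j = 1; j <= i; j++) {
            c.comparisons++;
            if (items[j] < items[j - 1]) {
                swapItems(j, j - 1, items);
                c.swaps++;
                done = 0;
            }
        }
    }
    return c;
}
```

On the three arrays from this exercise the naive version reports 10 comparisons each time, with 5, 0 and 10 swaps. With early exit, the sorted array needs only n - 1 = 4 comparisons and no swaps, which is the O(n) best case described above.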

Exercise 3

Sorting Linked Lists

Implement selection sort, given the following definition of a linked list

typedef struct node * link;
struct node{
    Item item;
    link next;
};
link selectionSort(link list){ 
    link sorted = NULL;
    link curr = NULL;
    link prev = NULL;
    link max = NULL;
    link maxPrev = NULL;

    
    // Keep finding the max in the original list
    // and adding to the front of the sorted list
    // until the original list is empty
    while(list != NULL){
        //Find max
        prev = NULL;
        maxPrev = NULL;
        max = list;
        for(curr = list ;curr!=NULL;curr= curr->next){
           if(curr->item > max->item){
                max = curr;
                maxPrev = prev;
           }
           prev = curr;
        }
        //Remove from original list
        if(maxPrev != NULL){
            maxPrev->next = max->next;
        }else{
            list  = max->next;
        }
        // Add the max to the front of the sorted list
        max->next = sorted;
        sorted = max;
    }
    return sorted;
}

This implementation is not stable. If the list contains duplicates, the first copy is put at the front of the sorted list, and then the next copy is put in front of it, reversing the order of the duplicates. Changing > to >= when looking for the max makes it stable, because the last copy is then selected first:

           if(curr->item >= max->item){
                max = curr;
                maxPrev = prev;
           }
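
To see the effect of >= on duplicates, here is a sketch of the same function where, as an assumption made only for this illustration, Item is a struct holding a key and a tag that records the original position. The lecture code compares curr->item directly, so comparing item.key is a change made here to tell duplicates apart.

```c
#include <stddef.h>

/* Assumption for illustration: an item is a sort key plus a tag
   that identifies which copy of a duplicate it is. */
typedef struct {
    int key;
    char tag;
} Item;

typedef struct node *link;
struct node {
    Item item;
    link next;
};

/* Selection sort on a linked list, ascending by key.
   Using >= means the LAST copy of the maximum is removed first,
   so earlier copies are later pushed in front of it. */
link selectionSortStable(link list) {
    link sorted = NULL;
    while (list != NULL) {
        link prev = NULL;
        link maxPrev = NULL;
        link max = list;
        link curr;
        for (curr = list; curr != NULL; curr = curr->next) {
            if (curr->item.key >= max->item.key) {
                max = curr;
                maxPrev = prev;
            }
            prev = curr;
        }
        /* Remove max from the original list */
        if (maxPrev != NULL) {
            maxPrev->next = max->next;
        } else {
            list = max->next;
        }
        /* Add max to the front of the sorted list */
        max->next = sorted;
        sorted = max;
    }
    return sorted;
}
```

For the list 2a, 1b, 2c, 1d (key followed by tag) the result is 1b, 1d, 2a, 2c: both pairs of duplicates keep their original order.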

The implementation of selection sort using arrays from lectures is not stable. When the max/min item is selected, it is swapped with the item at its destination, and that swap can move the displaced item to a position after one of its duplicates.

Even sorting something as simple as 1 1 2 in descending order reverses the 1s: the 2 is swapped with the first 1, which then ends up after the second 1.

Making the array version stable is harder: instead of swapping, we would have to shift possibly many items in the array, which makes the implementation less efficient.
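
The 1 1 2 example can be reproduced with a small sketch. This is not the lecture implementation; Record and selectionSortDesc are names made up here, and each record carries a tag so that the two 1s can be told apart.

```c
/* Hypothetical record type: a sort key plus a tag for the original position. */
typedef struct {
    int key;
    char tag;
} Record;

/* Array selection sort in descending order: find the largest remaining
   record and swap it into position i. The swap is what breaks stability. */
void selectionSortDesc(Record items[], int n) {
    int i, j;
    for (i = 0; i < n - 1; i++) {
        int max = i;
        for (j = i + 1; j < n; j++) {
            if (items[j].key > items[max].key) {
                max = j;
            }
        }
        Record tmp = items[i];
        items[i] = items[max];
        items[max] = tmp;
    }
}
```

Starting from 1a, 1b, 2c the result is 2c, 1b, 1a. The first swap exchanges 2c with 1a, which moves 1a behind 1b, and nothing later moves it back.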

Exercise 4

Quick Sort

The following is the implementation of quicksort and the partition function as discussed in the lectures.

   
void quickSort (int a[], int l, int r){
   int i;
   if (r <= l) {
       return;
   }
   i = partition (a, l, r);
   quickSort (a, l, i-1);
   quickSort (a, i+1, r);
}

int partition (int a[], int l, int r) {
   int i = l-1;
   int j = r;
   int pivot = a[r]; //rightmost is pivot
   for(;;) {
       while ( a[++i] < pivot) ;
       while ( pivot <  a[--j] && j != l);
       if (i >= j) {
           break;
       }
       swap(i,j,a);
   }
   //put pivot into place
   swap(i,r,a);
   return i; //Index of the pivot
}

Trace the execution of the partition function on the following input sequence of numbers: 4 7 1 1 4 6 7 2 5 6.

Trace the execution of the partition function on the following sequence of numbers: 1 2 3 4 5 6 7 8 9 10

Note that this sequence results in 10 being chosen as the pivot. This is one of the worst case scenarios for a pivot, as it is larger than all the other elements in the sequence, so one side of the partition is empty and the other holds everything else.
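
One way to check a hand trace is to instrument the partition function so that it prints the array at the start and after every swap. The sketch below keeps the same logic as the partition function above; the name partitionTrace and the show helper are made up for this illustration.

```c
#include <stdio.h>

static void swap(int index1, int index2, int items[]) {
    int tmp = items[index1];
    items[index1] = items[index2];
    items[index2] = tmp;
}

/* Hypothetical helper: print a[l..r] on one line. */
static void show(const int a[], int l, int r) {
    int k;
    for (k = l; k <= r; k++) {
        printf("%d ", a[k]);
    }
    printf("\n");
}

/* Same logic as partition, but prints the array at the start
   and after every swap, including the final pivot swap. */
int partitionTrace(int a[], int l, int r) {
    int i = l - 1;
    int j = r;
    int pivot = a[r]; /* rightmost is pivot */
    show(a, l, r);
    for (;;) {
        while (a[++i] < pivot) ;
        while (pivot < a[--j] && j != l) ;
        if (i >= j) {
            break;
        }
        swap(i, j, a);
        show(a, l, r);
    }
    swap(i, r, a); /* put pivot into place */
    show(a, l, r);
    return i;
}
```

Called with l = 0 and r = 9 on 4 7 1 1 4 6 7 2 5 6, the pivot is 6 and the function returns index 6, leaving 4 5 1 1 4 2 6 6 7 7. On 1 2 3 4 5 6 7 8 9 10 it returns index 9 without moving any element, which is the unbalanced split described above.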

Exercise 5

Quicksort with median of three partitioning

One improvement to the Quicksort algorithm that we discussed was the use of Median-of-Three partitioning. In this version of Quicksort, three items in the array are sampled and the median of the three values is used to partition the array. Without looking at the C code given in lectures, complete the following program by replacing the comments with the relevant C program statements.

// Quick sort with Median of Three Partitioning
void quicksortMT (Item a[], int l, int r) { 
  int i;

  if (r <= l) return;
  if(r-l > 1){
      medianOfThreePivot(a,l,r);
      i = partition (a, l+1, r-1);
  } else {
      i = partition (a, l, r);
  }

  quicksortMT (a, l, i-1);
  quicksortMT (a, i+1, r);
}

void medianOfThreePivot(int a[], int l, int r){   
       // Swap median (value mid-way between l and r) with a[r - 1]
       // Compare a[l], a[r - 1] and a[r]
       // Rearrange values:
       // lowest value to be stored in a[l]   
       // highest value to be stored in a[r]
       // median value to be stored in a[r-1]
       //  You are required to provide an implementation for this in lab 07 
}

int partition (int a[], int l, int r) {
   int i = l-1;
   int j = r;
   int pivot = a[r]; //rightmost is pivot
   for(;;) {
       while ( a[++i] < pivot) ;
       while ( pivot <  a[--j] && j != l);
       if (i >= j) {
           break;
       }
       swap(i,j,a);
   }
   //put pivot into place
   swap(i,r,a);
   return i; //Index of the pivot
}

void swap(int index1, int index2, int items[]){
    int tmp;
    tmp = items[index1];
    items[index1] = items[index2];
    items[index2] = tmp;
}

Trace the call of medianOfThreePivot and then partition on the sequence 1 2 3 4 5 6 7 8 9 10

 Solution:
void medianOfThreePivot(int a[], int l, int r){   
       // Swap median (value mid-way between l and r) with a[r - 1]
   
       int mid = (r+l)/2;
       swap(r-1,mid,a);
       // Compare a[l], a[r - 1] and a[r]
       // Rearrange values:
       // lowest value to be stored in a[l]   
       // highest value to be stored in a[r]
       // median value to be stored in a[r-1]

       if(a[r-1] < a[l]){
           swap(r-1,l,a);
       }
       if(a[r] < a[l]){
           swap(r,l,a);
       }
       if(a[r] < a[r-1]){
           swap(r,r-1,a);
       }
}
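
For the trace, it can help to put the pieces of this exercise together into one file that compiles. The sketch below assumes the items are plain ints (the skeleton above declares the array as Item a[] in quicksortMT) and otherwise repeats the code from this exercise with the solution filled in.

```c
/* Exchange items[index1] and items[index2]. */
void swap(int index1, int index2, int items[]) {
    int tmp = items[index1];
    items[index1] = items[index2];
    items[index2] = tmp;
}

/* Leaves the lowest of the three sampled values in a[l],
   the median in a[r-1] and the highest in a[r]. */
void medianOfThreePivot(int a[], int l, int r) {
    int mid = (r + l) / 2;
    swap(r - 1, mid, a);
    if (a[r - 1] < a[l]) swap(r - 1, l, a);
    if (a[r] < a[l])     swap(r, l, a);
    if (a[r] < a[r - 1]) swap(r, r - 1, a);
}

int partition(int a[], int l, int r) {
    int i = l - 1;
    int j = r;
    int pivot = a[r]; /* rightmost is pivot */
    for (;;) {
        while (a[++i] < pivot) ;
        while (pivot < a[--j] && j != l) ;
        if (i >= j) {
            break;
        }
        swap(i, j, a);
    }
    swap(i, r, a); /* put pivot into place */
    return i;
}

/* Assumption: items are ints rather than the Item type in the skeleton. */
void quicksortMT(int a[], int l, int r) {
    int i;
    if (r <= l) return;
    if (r - l > 1) {
        medianOfThreePivot(a, l, r);
        i = partition(a, l + 1, r - 1);
    } else {
        i = partition(a, l, r);
    }
    quicksortMT(a, l, i - 1);
    quicksortMT(a, i + 1, r);
}
```

On 1 2 3 4 5 6 7 8 9 10 with l = 0 and r = 9, mid is 4, so medianOfThreePivot swaps a[8] and a[4] to give 1 2 3 4 9 6 7 8 5 10, and none of the three comparisons triggers a further swap. The call partition(a, 1, 8) then uses 5 as the pivot, swaps it back into index 4 and returns 4. The array is split almost evenly, in contrast to the plain quicksort trace in Exercise 4 where the pivot ended up at index 9.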

Exercise 6

Mergesort

Among the good points of Mergesort:

Bad points of Mergesort: