Week 02a: Dynamic Data Structures


Memory1/69

Reminder:

Computer memory … large array of consecutive data cells or bytes
  • char   … 1 byte
  • int,float   … 4 bytes
  • double   … 8 bytes
  • any_type *   … 4 bytes (on CSE lab computers)
Memory addresses shown in Hexadecimal notation

[Diagram:Pic/memoryAddresses-small.png]


C execution: Memory2/69

An executing C program partitions memory into:


... C execution: Memory3/69

[Diagram:Pic/memory-small.png]


Exercise #1: Memory Regions4/69


int numbers[] = { 40, 20, 30 };

void insertionSort(int array[], int n) {
   int i, j;
   for (i = 1; i < n; i++) {
      int element = array[i];
         while (j >= 0 && array[j] > element) {
         array[j+1] = array[j];
         j--;
      }
      array[j+1] = element;
   }
}

int main(void) {
   insertionSort(numbers, 3);
   return 0;
}

Which memory region are the following objects located in?

  1. insertionSort()
  2. numbers[0]
  3. n
  4. array[0]
  5. element


  1. code
  2. global
  3. stack
  4. global
  5. stack


Dynamic Data Structures


Dynamic Memory Allocation7/69

So far, we have considered static memory allocation

Examples:

int   x;     // 4 bytes containing a 32-bit integer value
char *cp;    // 4 bytes (on CSE machines)
             //   containing address of a char
typedef struct {float x; float y;} Point;
Point p;     // 8 bytes containing two 32-bit float values
char  s[20]; // array containing space for 20 1-byte chars


... Dynamic Memory Allocation8/69

In many applications, fixed-size data is ok.

In many other applications, we need flexibility.

Examples:

char name[MAXNAME];   // how long is a name?
char item[MAXITEMS];  // how high can the stack grow?
char dictionary[MAXWORDS][MAXWORDLENGTH];
                     // how many words are there?
                     // how long is each word?

With fixed-size data, we need to guess sizes  ("large enough").


... Dynamic Memory Allocation9/69

Fixed-size memory allocation:

Dynamic memory allocation: But how to do this in C?


Dynamic Data Example10/69

Problem:

Example input:   6 25 -1 999 42 -16 64

How to define the vector?


... Dynamic Data Example11/69

Suggestion #1: allocate a large vector; use only part of it

#define MAXELEMS 1000

// how many elements in the vector
int numberOfElems;
scanf("%d", &numberOfElems);
assert(numberOfElems <= MAXELEMS);

// declare vector and fill with user input
int i, vector[MAXELEMS];
for (i = 0; i < numberOfElems; i++)
   scanf("%d", &vector[i]);

Works ok, unless too many numbers; usually wastes space.

Recall that assert() terminates program with standard error message if test fails.


... Dynamic Data Example12/69

Suggestion #2: use variables to give object sizes

// how many elements in the vector
int numberOfElems;
scanf("%d", &numberOfElems);

// declare vector and fill with user input
int i, vector[numberOfElems];
for (i = 0; i < numberOfElems; i++)
   scanf("%d", &vector[i]);

Produces compiler error   (compiler needs to know object sizes)


... Dynamic Data Example13/69

Suggestion #3: create vector after count read in

#include <stdlib.h>

// how many elements in the vector
int numberOfElems;
scanf("%d", &numberOfElems);

// declare vector and fill with user input
int i, *vector;
size_t numberOfBytes;
numberOfBytes = numberOfElems * sizeof(int);

vector = malloc(numberOfBytes);
assert(vector != NULL);

for (i = 0; i < numberOfElems; i++)
   scanf("%d", &vector[i]);

Works unless the heap is already full  (very unlikely)

Reminder: because of pointer/array connection   &vector[i] == vector+i


The malloc() function14/69

Recall memory usage within C programs:

[Diagram:Pic/memusage-small.png]


... The malloc() function15/69

malloc() function interface

void *malloc(size_t n);

What the function does:

Note: size_t is essentially an unsigned int


... The malloc() function16/69

Example use of malloc:

[Diagram:Pic/malloc1-small.png]


... The malloc() function17/69

Things to note about   void *malloc(size_t):

Required size is determined by   #Elements * sizeof(ElementType)


Exercise #2: Dynamic Memory Allocation18/69

Write code to

  1. create a dynamic m×n-matrix of floating point numbers, given m and n
  2. create space for 1,000 speeding tickets (cf. Lecture Week 1)
How many bytes need to be reserved in each case?


  1. Matrix:

    float *matrix = malloc(m * n * sizeof(float));
    assert(matrix != NULL);
    

    mn bytes allocated

  2. Speeding tickets:

    typedef struct {
    	int day, month, year;
    } DateT;
    typedef struct {
    	int hour, minute;
    } TimeT;
    typedef struct {
    	char  plate[7];
    	DateT d;
    	TimeT t;
    } TicketT;
    
    TicketT *tickets = malloc(1000 * sizeof(TicketT));
    assert(tickets != NULL);
    

    28,000 bytes allocated


Exercise #3: Memory Regions20/69

Which memory region is tickets located in? What about *tickets?


tickets is a variable located in the stack
*tickets is in the heap (after malloc'ing memory)


... The malloc() function22/69

malloc() returns a pointer to a data object of some kind.

Things to note about objects allocated by malloc():

The function free() releases objects allocated by malloc()


... The malloc() function23/69

Usage of malloc() should always be guarded:

int *vector, length, i;
…
vector = malloc(length*sizeof(int));
// but malloc() might fail to allocate
assert(vector != NULL);
// now we know it's safe to use vector[]
for (i = 0; i < length; i++) {
	… vector[i] …
}

Alternatively:


int *vector, length, i;
…
vector = malloc(length*sizeof(int));
// but malloc() might fail to allocate
if (vector == NULL) {
	fprintf(stderr, "Out of memory\n");
	exit(1);
}
// now we know its safe to use vector[]
for (i = 0; i < length; i++) {
	… vector[i] …
}


Memory Management24/69

void free(void *ptr)

Things to note:


... Memory Management25/69

Warning! Warning! Warning! Warning!

Careless use of malloc() / free() / pointers

Such errors are very difficult to track down and debug.

Must be very careful with your use of malloc() / free() / pointers.


... Memory Management26/69

If an uninitialised or otherwise invalid pointer is used, or an array is accessed with a negative or out-of-bounds index, one of a number of things might happen:

The first is the most desirable, but cannot be relied on.


... Memory Management27/69

Given a pointer variable:


... Memory Management28/69

Typical usage pattern for dynamically allocated objects:

// single dynamic object e.g. struct
Type *ptr = malloc(sizeof(Type));
assert(ptr != NULL);
… use object referenced by ptr e.g. ptr->name …
free(ptr);

// dynamic array with "nelems" elements
int nelems = NumberOfElements;
ElemType *arr = malloc(nelems*sizeof(ElemType));
assert(arr != NULL);
… use array referenced by arr e.g. arr[4] …
free(arr);


Memory Leaks29/69

Well-behaved programs do the following:

A program which does not free() each object before the last reference to it is lost contains a memory leak.

Such programs may eventually exhaust available heapspace.


Exercise #4: Dynamic Arrays30/69

Write a C-program that


Sidetrack: Standard I/O Streams, Redirects31/69

Standard file streams:

Executing a C program causes main(…) to be invoked


... Sidetrack: Standard I/O Streams, Redirects32/69

The streams stdin, stdout, stderr can be redirected


Abstract Data Types


Abstract Data Types34/69

Reminder: An abstract data type is …

E.g. does a client want/need to know how a Stack is implemented?


... Abstract Data Types35/69

ADT interface provides

ADT implementation gives ADTs are important because …


Stack as ADT36/69

Interface (in stack.h)

// provides an opaque view of ADT
typedef struct StackRep *stack;

// set up empty stack
stack newStack();
// remove unwanted stack
void dropStack(stack);
// check whether stack is empty
int StackIsEmpty(stack);
// insert an int on top of stack
void StackPush(stack, int);
// remove int from top of stack
int StackPop(stack);

ADT stack defined as a pointer to an unspecified struct


Static/Dynamic Sequences37/69

Previously we have used an array to implement a stack

The "fixed size" aspect is a potential problem: The rigid sequence is another problems:


... Static/Dynamic Sequences38/69

Inserting a value into a sorted array (insert(a,&n,4)):

[Diagram:Pic/arrayInsert-small.png]


... Static/Dynamic Sequences39/69

Deleting a value from a sorted array (delete(a,&n,3)):

[Diagram:Pic/arrayDelete-small.png]


Dynamic Sequences40/69

The problems with using arrays can be solved by

[Diagram:Pic/linkedList-small.png]

Benefits:


Self-referential Structures41/69

To realise a "chain of elements", need a node containing

In C, we can define such nodes as:

struct node {
   int data;
   struct node *next;
};


... Self-referential Structures42/69

When defining self-referential types with typedef

typedef struct node {
   int data;
   struct node *next;
} NodeT;


... Self-referential Structures43/69

Note that the following definition does not work:

typedef struct {
   int data;
   NodeT *next;
} NodeT;

Because NodeT is not yet known (to the compiler) when we try to use it to define the type of the next field.

The following is also illegal in C:

struct node {
   int data;
   struct node recursive;    
};

Because the size of the structure would have to satisfy sizeof(struct node) = sizeof(int) + sizeof(struct node) = ∞.


Linked Lists in C


Linked Lists45/69

To represent a chained (linked) list of nodes:

[Diagram:Pic/linkedList2-small.png]


... Linked Lists46/69

Linked lists are more flexible than arrays:

Disadvantages:


Memory Storage for Linked Lists47/69

Linked list nodes are typically located in the heap

Variables containing pointers to list nodes Pointers to the start of lists are often


... Memory Storage for Linked Lists48/69

Create a new list node:

NodeT *makeNode(int v) {
   NodeT *new = malloc(sizeof(NodeT));
   assert(new != NULL);
   new->data = v;       // initialise data
   new->next = NULL;    // initialise link to next node
   return new;          // return pointer to new node
}


... Memory Storage for Linked Lists49/69

[Diagram:Pic/linkedList3-small.png]


Exercise #5: Creating a linked list50/69

Write C-code to create a linked list of three nodes with values 1, 42 and 9024.


NodeT *list = createNode(1);
list->next  = createNode(42);
list->next->next = createNode(9024);


Iteration over Linked Lists52/69

When manipulating list elements

To iterate over a linked list:


... Iteration over Linked Lists53/69

Standard method for scanning all elements in a linked list:

NodeT *list;  // pointer to first Node in list
NodeT *p;     // pointer to "current" Node in list

p = list;
while (p != NULL) {
	… do something with p->data 
	p = p->next;
}

// which is frequently written as

for (p = list; p != NULL; p = p->next) {
	… do something with p->data 
}


... Iteration over Linked Lists54/69

[Diagram:Pic/listIterate1-small.png]


... Iteration over Linked Lists55/69

[Diagram:Pic/listIterate2-small.png]


... Iteration over Linked Lists56/69

Check if list contains an element:

int inLL(NodeT *list, int d) {
   NodeT *p;
   for (p = list; p != NULL; p = p->next)
      if (p->data == d)      // element found
         return 1;
   return 0;                 // element not in list
}

Print all elements:

showLL(NodeT *list) {
   NodeT *p;
   for (p = list; p != NULL; p = p->next)
      printf("%6d", p->data);
}


Exercise #6: Traversing a linked list57/69

What does this code do?

1  NodeT *p = list;
2  while (p != NULL) {
3     printf("%6d", p->data);
4     if (p->next != NULL)
5        p = p->next->next;
6     else
7        p = NULL;
9  }

What is the purpose of the conditional statement in line 4?


Every second list element is printed.

If *p happens to be the last element in the list, then p->next->next does not exist.
The if-statement ensures that we do not attempt to assign an invalid address to p in line 5.


Exercise #7: Traversing a linked list59/69

Rewrite showLL() as a recursive function.


void printLL(NodeT *list) {
   if (list != NULL) {
      printf("%6d", list->data);
      printLL(list->next);
   }
}


Modifying a Linked List61/69

Insert a new element at the beginning:

NodeT *insertLL(NodeT *list, int d) {
   NodeT *new = makeNode(d);  // create new list element
   new->next = list;          // link to beginning of list
   return new;                // new element is new head
}

Delete the first element:

NodeT *deleteHead(NodeT *list) {
   assert(list != NULL);  // ensure list is not empty
   NodeT *head = list;    // remember address of first element
   list = list->next;     // move to second element
   free(head);
   return list;           // return pointer to second element
}

What would happen if we didn't free the memory pointed to by head?


... Modifying a Linked List62/69

Delete a specific element (recursive version):

NodeT *deleteLL(NodeT *list, int d) {
   if (list == NULL) {           // element not in list
      return list;

   } else if (list->data == d) {
      return deleteHead(list);   // delete first element 

   } else {                // delete element in tail list
      list->next = deleteLL(list->next, d);
      return list;
   }
}


Exercise #8: Freeing a list63/69

Write a C-function to destroy an entire list.


Iterative version:

void freeLL(NodeT *list) {
   NodeT *p;

   p = list;
   while (p != NULL) {
      NodeT *temp = p->next;
      free(p);
      p = temp;
   }
}

Why do we need the extra variable temp?


Stack ADT Implementation65/69

Linked list implementation (stack.c):


#include <stdlib.h>
#include <assert.h>
#include "stack.h"

typedef struct node {
   int data;
   struct node *next;
} NodeT;

typedef struct StackRep {
   int    height;   // #elements on stack
   NodeT *top;      // ptr to first element
} StackRep;

// set up empty stack
stack newStack() {
   stack S = malloc(sizeof(StackRep));     
   S->height = 0;
   S->top = NULL;
   return S;
}

// remove unwanted stack
void dropStack(stack S) {
   NodeT *curr = S->top;
   while (curr != NULL) {  // free the list
      NodeT *temp = curr->next;
      free(curr);
      curr = temp;
   }
   free(S);           // free the stack rep
}







// check whether stack is empty
int StackIsEmpty(stack S) {
   return (S->height == 0);
}

// insert an int on top of stack
void StackPush(stack S, int v) {
   NodeT *new = malloc(sizeof(NodeT));  
   assert(new != NULL);
   new->data = v;
   // insert new element at top
   new->next = S->top;
   S->top = new;
   S->height++;
}

// remove int from top of stack
int StackPop(stack S) {
   assert(S->height > 0);
   NodeT *head = S->top;
   // second list element becomes new top
   S->top = S->top->next;
   S->height--;
   // read data off first element, then free
   int d = head->data;
   free(head);
   return d;
}


Doubly-linked Lists66/69

Doubly-linked lists are a variation on "standard" linked lists where each node has a pointer to the previous node as well as a pointer to the next node.

[Diagram:Pic/linked-list-types-small.png]

You can find one possible implementation of the above doubly-linked list ADT in the directory "dllist", under "Programs" for this lecture.

.


Summary: Memory Management Functions67/69

void *malloc(size_t nbytes)

Things to note:


... Summary: Memory Management Functions68/69

void free(void *ptr)

Things to note:


Summary69/69



Produced: 30 Nov 2018