Laboratory
Objectives
- to familiarise yourself with practical aspects of computational complexity
- to practice a systematic approach to problem solving
- to apply sound scientific reasoning in reaching conclusions
- to hone your analysis skills
- to identify algorithms from their behaviour
Assessment
- Deadline: 20 January 2019, 23:59:59
- Marks: 3
- Submission: give cs2521 lab06
Warning!
To be able to meaningfully compare your timing results, you must use the same (or same type of) computer to gather all your timing data. If you do half on a login server and the rest on a lab machine, you will not be able to compare your results.
In other words, you need to use one type of machine, such as a lab machine, a login server, a VLAB server, etc. for the whole task. It will not matter if you change machines within a lab, as those machines are all of the same type.
Background
A long time ago, in a burst of enthusiasm, Richard Buckland wrote a bunch of sorting programs. He forgot to give the programs meaningful names, and when he passed them on to me to use for this lab, he couldn’t tell me which program implemented which algorithm. All that he could remember* is that the algorithms he used came from the following list:
- Oblivious Bubble Sort, an unoptimised bubble sort;
- Bubble Sort With Early Exit, a bubble sort that terminates if there have been no exchanges in one pass;
- Insertion Sort, a standard insertion sort;
- Selection Sort, a standard in-place selection sort;
- Shell Sort, Powers of Four, a Shell sort whose intervals are powers of four;
- Shell Sort, Sedgewick, a Shell sort using Sedgewick's interval sequence;
- Pseudo-Bogo Sort, which chooses two random array elements, and swaps them if they're out of order.
* … which means that it’s pointless trying to extract the algorithm identities by capturing and torturing him
Despite not knowing what algorithm was in each program, Richard did remember a few things about the programs:
- all of the programs read their input from stdin, and write the sorted version of the data to stdout;
- sorting happens line-by-line, like the Unix sort program does it;
- there is a limit on the size of the input each program can process (1,000,000 lines), because they read their input into a fixed-size array, and sort it there, before writing out the sorted result;
- the programs all expect each line to start with a number, which acts as the sorting key. The sorting is numeric, which means that the programs all behave something like the Unix sort program run with the -n option. If no number is present at the start of a line, it will be treated as a zero value. If the file has no numbers at all, the final ordering will depend on the stability of the sorting algorithm.

./sortX < data > sorted_data   # behaves like sort -n < data > sorted_data
One difference between Richard’s sorting programs and the Unix one is that many of his are stable, but the Unix sort is not (by default). Unless the algorithm itself is inherently unstable, Richard’s sorting programs are stable.
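For example (using made-up data), suppose the input contains two lines that tie on the key 2:

3 apple
2 banana
2 cherry
1 damson

A stable sort must keep the tied lines in their original relative order in its output:

1 damson
2 banana
2 cherry
3 apple

whereas an unstable sort might emit "2 cherry" before "2 banana".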
Your task, in two stages, is to help us identify the specific sort algorithms used in two of these programs. You will not be able to view the source code: instead you will have to try to identify the algorithms using only the programs’ observable behaviour when sorting different data. Note that since the programs are only available in binary format, they will most likely only run on the CSE machines, so you’ll have to do your work there.
In the setup phase, we will give you access to two different sort programs; each lab pair gets a different (randomly chosen) pair of programs. The first phase of the task is to design and write up the experimental framework that you plan to use to solve the problem. In the second phase, you should gather data on the execution behaviour of the two programs, according to your experimental setup; you then add the data to your report, and analyse it to reach a conclusion on which algorithm each of the supplied programs contains.
To make your task a little simpler, we’ve supplied a program to generate data in the correct format for the sorting programs to use.
./gen
Not enough arguments
Usage: gen N A|D|R [S]
  N = number of lines
  A|D|R = Ascending|Descending|Random
  S = seed for Random
./gen 5 R
1 nwl
5 arz
4 hcd
2 rbb
3 mqb
Use the gen program however you like, or not at all. Note that the gen program always generates a unique set of keys (from 1..N). This won't enable you to test stability, so you'll need to come up with some more data of your own. Note that the seed argument allows you to generate the same sequence of "random" numbers: if you want to test both sort programs on the same random sequence, use the same seed. Note that the setup script will put a copy of the gen executable into your lab directory, so you can run it as ./gen, rather than having to type the whole file name.
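Since gen always produces unique keys, one possible way to make data with repeated keys for stability testing is sketched below; the file name dupdata and the exact recipe are illustrations only, not a required approach.

# 1000 lines whose keys are only 1..10, so every key value repeats;
# the second field records the original line number, which makes it
# easy to see whether tied keys keep their input order after sorting
seq 1000 | awk '{ printf "%d line%d\n", $1 % 10 + 1, $1 }' > dupdata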
Setting up
Create a directory for this lab, change into it, and obtain your sorting programs for analysing by running:
/web/cs2521/19t0/week06/lab/setup
This must be run on the CSE lab machines, on one of the CSE login servers like wagner or weill, or on a CSE VLAB server.
This command will set up two symbolic links called sortA and sortB: these reference executable programs under the class account, which you do not have read permission for, in order to remove the temptation to reverse engineer the algorithm from the object code. The setup script also makes a copy of the gen program and of the runtests shell script (see below) into your lab directory.
You can check that the sortA and sortB programs actually do sorting by running something like the following:

./gen 5 R
... unsorted output ...
./gen 5 R | ./sortA
... sorted output ...
./gen 5 R | ./sortB
... sorted output ...
Only one person in the lab pair needs to run setup. If you both do it, you'll end up with different pairs of programs. Each lab pair should work together on one pair of programs.
Phase 1: Designing your Experiment
Design, plan and document the set of tests which you will later use to deduce which sorting algorithm is used by each sort program. Once you have done this, show it to your tutor and explain to them how you think the tests will allow you to identify the algorithms.
This is to ensure that by the time you begin investigating and experimenting with the actual sort programs you have thoroughly thought about what kind of behaviour to look for and what further experimentation might be necessary when analysing your findings.
We expect this will involve devising numerous sequences of test data, and describing what differences you expect to be able to observe between different types of sorting algorithms, and why. Typical properties to look for are execution time and output stability.
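For example, one possible stability check (a sketch only; it assumes GNU sort, whose -s option forces a stable sort, and a duplicate-key file like the dupdata example above) is to compare a program's output against a stable reference ordering of the same data:

# stable reference ordering of the duplicate-key data
sort -n -s < dupdata > stable_ref
# any difference can only be in the order of tied keys,
# which would show that sortA is not stable on this input
./sortA < dupdata | diff stable_ref -

Remember that matching output on one input is consistent with stability but does not prove it (see the tips at the end of this page).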
Of course, when designing tests you cannot anticipate all possible results that might occur during your experiment. This is the nature of scientific experimentation. But by formalising what you expect to occur and how you will respond, you can better account for unexpected behaviour and sensibly revise your design or create new tests once the experiment is under way.
Your experimental design should detail the tests you have devised and explain, with clear reasons, how you will be able to distinguish each algorithm you might be given. You do not need to include all the unsorted input data you intend to use, only a description or small sample of it (you may put this in the appendix if you wish).
Write up the experimental design as Part 1 of your report. You can produce the report using whatever tools you want (e.g. OpenOffice, Google Docs, raw HTML, your WebCMS blog, etc.), but it must eventually be submitted as a PDF. There is no size requirement for the report; it is content, not length, that counts, but as a guide we expect the average report will be 1-2 A4 pages for the experimental design and 2-3 A4 pages for the experimental results and analysis. If you want to include detailed reporting of timing results and/or output checking, put these in an appendix. Your report should be clear and scientific/systematic in approach, and all reasoning and assumptions should be made explicit. Make sure you ask your tutor questions if you are unclear about what is expected.
To help you get started, a template for the report is available. Note that a fault we often see is that students simply report observations without attempting to analyse them or explain why those results occurred. For this lab, try to get beyond saying just "This is what I saw" and include "These observations can be explained by …".
Phase 2: Running your Experiment
The setup command has given your lab pair two sort programs to identify. As noted earlier, each sort program reads from its standard input and writes to its standard output, and assumes that each input line contains a numeric key (first field) and an arbitrary string following the key. The output should be in ascending order, based on the numeric ordering of keys. All programs should produce the same output for a given input when the keys are unique.
The following examples show some useful ways of running the sort programs, and auxiliary commands to help collect useful data on their behaviour:
# generate some data, put in a file called "mydata"
./gen 100 R > mydata
# count the number of lines in the data (should be 100)
wc -l mydata
# sort the data using sortA, put the result in "sortedA"
./sortA < mydata > sortedA
# sort the data using sortB, put the result in "sortedB"
./sortB < mydata > sortedB
# count the number of lines in "sortedA" (should also be 100)
wc -l sortedA
# sort the data using Unix sort
sort -n < mydata > sorted
# check that the sortA and sortB programs actually sorted
diff sorted sortedA   # should show no diffs
diff sorted sortedB   # should show no diffs
# run a large sort and throw away the result
./gen 100000 R | ./sortA > /dev/null
# repeat the above, but get timing data on sortA
./gen 100000 R | time ./sortA > /dev/null
# repeat the timing, but with better output format
./gen 100000 R | /usr/bin/time --format="%U seconds" ./sortA > /dev/null
You should now carry out the experiment you designed in Phase 1. Collect and record all of the data, and then summarize it in your report. You can use whatever tools you like to produce useful summaries (e.g. plot graphs of time vs data size).
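For instance, here is a sketch of one way to collect a time-versus-size table for sortA that is easy to plot; the sizes and the file names tmpdata, ttime and timesA.txt are illustrative only, and it relies on GNU time's --output option:

# user time for sortA over a range of input sizes,
# one "size seconds" line per size
for n in 5000 10000 20000 50000 100000; do
    ./gen $n R > tmpdata
    /usr/bin/time --format="$n %U" --output=ttime ./sortA < tmpdata > /dev/null
    cat ttime
done > timesA.txt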
To help with the experiments, we have provided a shell script called runtests to (surprise!) run tests and collect timing data. As supplied, this script tests the built-in sort program, which is not helpful for you, so you'll need to modify it to use one of your sortA or sortB programs. You could use it as follows:
# set up appropriate testing for sortA
edit runtests
# run all specified tests and write results to the log file
sh runtests
# save a copy of the test log for sortA
mv log logA
# set up appropriate testing for sortB
edit runtests
# run all specified tests and write results to the log file
sh runtests
# save a copy of the test log for sortB
mv log logB
You can modify this script as much as you like to fit in with the tests that you devise. You don’t even need to use the script at all if you’re happy to run all of your test cases manually.
Note that some tests will take a l-o-n-g time to run with large data. You can remove the large data sizes from the outer for loop if you can't wait, but you should probably add more smaller sizes to get more data points to try to determine execution cost trends. Unfortunately, students cannot leave jobs running in the background after logging out, so you'll need to stay logged in to a CSE machine while you're running tests.
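As a rough illustration only (the supplied runtests script may well look quite different), the kind of outer loop being described is something like:

# outer loop over data sizes: drop the big sizes, or add smaller
# ones, depending on how long you are prepared to wait
for size in 10000 20000 50000 100000 200000; do
    for order in A D R; do
        ./gen $size $order | /usr/bin/time --format="$size $order %U" ./sortA > /dev/null
    done
done 2> log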
Once you've run the tests, you might want to edit the log files to clean out the irrelevant stuff.
Tips for measuring: the Unix time command works by sampling and will likely produce different results for the same program run multiple times (the runtests script will do this for you). Take an average over a number of timings to account for this. Also, beware of claiming too much accuracy: you can't really claim more than one or two significant digits on an average from the time command.
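For example, one way to repeat a timing and average it (a sketch; it assumes the input file mydata already exists and uses GNU time):

# run sortA five times on the same input and average the user times
for i in 1 2 3 4 5; do
    /usr/bin/time --format="%U" ./sortA < mydata > /dev/null
done 2>&1 | awk '{ sum += $1 } END { printf "average: %.1f seconds\n", sum / NR }'

Reporting the average to one decimal place is in keeping with the accuracy warning above.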
The precise format of your report is up to you, but it must include:
- a summary of the results for each program
- an argument, based on the observed behaviour, for what each program is
If you want to include detailed results, put them in an appendix.
Tips on Writing the Report
Note that these tips come from an assignment that was used in this course several years ago. Since this is a lab and not an assignment, the scale will be smaller.
Bad things
- Not checking your spelling.
- Not proofreading your report to check for other mistakes.
  - e.g. writing "sortA" when you are actually talking about sortB
- Trying to perform meaningful analysis using very small timing data
  - e.g. 0.004s vs 0.005s. Too much noise! Aim for readings in seconds, not milliseconds.
- Focusing on absolute speed, instead of on growth rates (O(n²), O(n log n), etc.)
  - e.g. "this ran fast", "this ran slow" – what does that tell you? how do you account for constants?
- Trying to compare the absolute speed between your sorting algorithms
  - e.g. "sortB was faster than sortA": we could have implemented each sort in different languages or with differing levels of optimisation, so it is meaningless to compare them this way! You may have had (e.g.) a "very slow" mergesort and a "very fast" bubble sort.
- Not being specific enough about how you decided on a particular sort algorithm
- Writing a table full of numbers, and then leaving the units out (seconds, milliseconds, hours… ?)
- Assuming a sort algorithm was stable for every input, just because it was stable on a particular input you gave it
  - It is easy to show that a sort algorithm is unstable: just find an example. It is much harder to prove that a sort algorithm is stable for every possible input that you could give it.
- Being vague when you could be specific
  - e.g. "much less", "slightly better", "increase a lot", "fast"
- Not being specific enough about your testing data
- Not stating assumptions and reasons clearly
- Making incorrect assumptions about the way our sorts were implemented (based on what?)
- Not being skeptical enough
  - e.g. "obviously", "clearly", …
- Putting filler in your report
  - e.g. putting filler in your report
  - e.g. putting filler in your report
- Not understanding how the sorting algorithms work
- Not taking report presentation seriously enough
Good things
- Trying to check whether an algorithm was unstable
- Clearly stating your expectations, testing a hypothesis
  - e.g. "We expect …"
- Being skeptical
  - e.g. checking whether the sorts were actually sorting all your input, or dropping some?
- Trying to overcome timing accuracy errors
  - e.g. repeating tests and taking an average, using large inputs that gave times in seconds rather than milliseconds
- Giving reasons for claims you make
  - "insertion sort is stable because…"
Submitting
You must get the lab marked by your tutor in your lab.
Submit your work with the give command, like so:
give cs2521 lab06
Or, if you are working from home, upload the relevant file(s) to the lab06 activity on WebCMS3 or on Give Online.