Laboratory
Objectives
- to familiarise yourself with practical aspects of computational complexity
- to practice a systematic approach to problem solving
- to apply sound scientific reasoning in reaching conclusions
- to hone your analysis skills
- to identify algorithms from their behaviour
Assessment
- Deadline: 20 January 2019, 23:59:59
- Marks: 3
- Submission: give cs2521 lab06
Warning!
To be able to meaningfully compare your timing results, you must use the same (or same type of) computer to gather all your timing data. If you do half on a login server and the rest on a lab machine, you will not be able to compare your results.
In other words, you need to use one type of machine, such as a lab machine, a login server, a VLAB server, etc. for the whole task. It will not matter if you change machines within a lab, as those machines are all of the same type.
Background
A long time ago, in a burst of enthusiasm, Richard Buckland wrote a bunch of sorting programs. He forgot to give the programs meaningful names, and when he passed them on to me to use for this lab, he couldn’t tell me which program implemented which algorithm. All that he could remember* is that the algorithms he used came from the following list:
- Oblivious Bubble Sort, an unoptimised bubble sort;
- Bubble Sort With Early Exit, a bubble sort that terminates if there have been no exchanges in one pass;
- Insertion Sort, a standard insertion sort;
- Selection Sort, a standard in-place selection sort;
- Shell Sort, Powers of Four, a Shell sort whose intervals are powers of four;
- Shell Sort, Sedgewick, a Shell sort using Sedgewick's interval sequence;
- Pseudo-Bogo Sort, which chooses two random array elements, and swaps them if they're out of order.
* … which means that it’s pointless trying to extract the algorithm identities by capturing and torturing him
Despite not knowing what algorithm was in each program, Richard did remember a few things about the programs:
- all of the programs read their input from stdin, and write the sorted version of the data to stdout;
- sorting happens line-by-line, like the Unix sort program does it;
- there is a limit on the size of the input each program can process (1,000,000 lines), because they read their input into a fixed-size array, and sort it there, before writing out the sorted result;
- the programs all expect each line to start with a number, which acts as the sorting key. The sorting is numeric, which means that the programs all behave something like the Unix sort program run with the -n option. If no number is present at the start of a line, it will be treated as a zero value. If the file has no numbers at all, the final ordering will depend on the stability of the sorting algorithm.

./sortX < data > sorted_data   # behaves like sort -n < data > sorted_data
One difference between Richard’s sorting programs and the Unix one is that many of his are stable, but the Unix sort is not (by default). Unless the algorithm itself is inherently unstable, Richard’s sorting programs are stable.
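For example (using made-up data), suppose the input contains two lines that tie on the key 2:

3 apple
2 banana
2 cherry
1 damson

A stable sort must keep the tied lines in their original relative order in its output:

1 damson
2 banana
2 cherry
3 apple

whereas an unstable sort might emit "2 cherry" before "2 banana".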
Your task, in two stages, is to help us identify the specific sort algorithms used in two of these programs. You will not be able to view the source code: instead you will have to try to identify the algorithms using only the programs’ observable behaviour when sorting different data. Note that since the programs are only available in binary format, they will most likely only run on the CSE machines, so you’ll have to do your work there.
In the setup phase, we will give you access to two different sort programs; each lab pair gets a different (randomly chosen) pair of programs. The first phase of the task is to design and write up the experimental framework that you plan to use to solve the problem. In the second phase, you should gather data on the execution behaviour of the two programs, according to your experimental setup; you then add the data to your report, and analyse it to reach a conclusion on which algorithm each of the supplied programs contains.
To make your task a little simpler, we’ve supplied a program to generate data in the correct format for the sorting programs to use.
./gen
Not enough arguments
Usage: gen N A|D|R [S]
  N = number of lines
  A|D|R = Ascending|Descending|Random
  S = seed for Random
./gen 5 R
1 nwl
5 arz
4 hcd
2 rbb
3 mqb
Use the gen program however you like, or not at all. Note that the gen program always generates a unique set of keys (from 1..N). This won't enable you to test stability, so you'll need to come up with some more data of your own. Note that the seed argument allows you to generate the same sequence of "random" numbers: if you want to test both sort programs on the same random sequence, use the same seed. Note that the setup script will put a copy of the gen executable into your lab directory, so you can run it as ./gen, rather than having to type the whole file name.
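Since gen always produces unique keys, one possible way to make data with repeated keys for stability testing is sketched below; the file name dupdata and the exact recipe are illustrations only, not a required approach.

# 1000 lines whose keys are only 1..10, so every key value repeats;
# the second field records the original line number, which makes it
# easy to see whether tied keys keep their input order after sorting
seq 1000 | awk '{ printf "%d line%d\n", $1 % 10 + 1, $1 }' > dupdata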
Setting up
Create a directory for this lab, change into it, and obtain your sorting programs for analysing by running:
/web/cs2521/19t0/week06/lab/setup
This must be run on the CSE lab machines, on one of the CSE login servers like wagner or weill, or on a CSE VLAB server.
This command will set up two symbolic links called sortA and sortB: these reference executable programs under the class account, which you do not have read permission for, in order to remove the temptation to reverse engineer the algorithm from the object code. The setup script also makes a copy of the gen program and of the runtests shell script (see below) into your lab directory.
You can check that the sortA and sortB programs actually do sorting by running something like the following:

./gen 5 R
... unsorted output ...
./gen 5 R | ./sortA
... sorted output ...
./gen 5 R | ./sortB
... sorted output ...
Only one person in the lab pair needs to run setup. If you both do it, you'll end up with different pairs of programs. Each lab pair should work together on one pair of programs.
Phase 1: Designing your Experiment
Design, plan and document the set of tests which you will later use to deduce which sorting algorithm is used by each sort program. Once you have done this, show it to your tutor and explain to them how you think the tests will allow you to identify the algorithms.
This is to ensure that by the time you begin investigating and experimenting with the actual sort programs you have thoroughly thought about what kind of behaviour to look for and what further experimentation might be necessary when analysing your findings.
We expect this will involve devising numerous sequences of test data, and describing what differences you expect to be able to observe between different types of sorting algorithms, and why. Typical properties to look for are execution time and output stability.
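For example, one possible stability check (a sketch only; it assumes GNU sort, whose -s option forces a stable sort, and a duplicate-key file like the dupdata example above) is to compare a program's output against a stable reference ordering of the same data:

# stable reference ordering of the duplicate-key data
sort -n -s < dupdata > stable_ref
# any difference can only be in the order of tied keys,
# which would show that sortA is not stable on this input
./sortA < dupdata | diff stable_ref -

Remember that matching output on one input is consistent with stability but does not prove it (see the tips at the end of this page).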
Of course, when designing tests you cannot anticipate all possible results that might occur during your experiment. This is the nature of scientific experimentation. But by formalising what you expect to occur and how you will respond, you can better account for unexpected behaviour and sensibly revise your design or create new tests once the experiment is under way.
Your experimental design should detail the tests you have devised and explain, with clear reasons, how you will be able to distinguish each algorithm you might be given. You do not need to include all the unsorted input data you intend to use, only a description or small sample of it (you may put this in the appendix if you wish).
Write up the experimental design as Part 1 of your report. You can produce the report using whatever tools you want (e.g. OpenOffice, Google Docs, raw HTML, your WebCMS blog, etc.), but it must eventually be submitted as a PDF. There is no size requirement for the report; it is content, not length, that counts, but as a guide we expect the average report will be 1-2 A4 pages for the experimental design and 2-3 A4 pages for the experimental results and analysis. If you want to include detailed reporting of timing results and/or output checking, put these in an appendix. Your report should be clear and scientific/systematic in approach, and all reasoning and assumptions should be made explicit. Make sure you ask your tutor questions if you are unclear about what is expected.
To help you get started, a template for the report is available. Note that a fault we often see is that students simply report observations without attempting to analyse them or explain why those results occurred. For this lab, try to get beyond saying just "This is what I saw" and include "These observations can be explained by …".
Phase 2: Running your Experiment
The setup command has given your lab pair two sort programs to identify. As noted earlier, each sort program reads from its standard input and writes to its standard output, and assumes that each input line contains a numeric key (first field) and an arbitrary string following the key. The output should be in ascending order, based on the numeric ordering of keys. All programs should produce the same output for a given input when the keys are unique.
The following examples show some useful ways of running the sort programs, and auxiliary commands to help collect useful data on their behaviour:
# generate some data, put in a file called "mydata"
./gen 100 R > mydata
# count the number of lines in the data (should be 100)
wc -l mydata
# sort the data using sortA, put the result in "sortedA"
./sortA < mydata > sortedA
# sort the data using sortB, put the result in "sortedB"
./sortB < mydata > sortedB
# count the number of lines in "sortedA" (should also be 100)
wc -l sortedA
# sort the data using Unix sort
sort -n < mydata > sorted
# check that the sortA and sortB programs actually sorted
diff sorted sortedA   # should show no diffs
diff sorted sortedB   # should show no diffs
# run a large sort and throw away the result
./gen 100000 R | ./sortA > /dev/null
# repeat the above, but get timing data on sortA
./gen 100000 R | time ./sortA > /dev/null
# repeat the timing, but with better output format
./gen 100000 R | /usr/bin/time --format="%U seconds" ./sortA > /dev/null
You should now carry out the experiment you designed in Phase 1. Collect and record all of the data, and then summarize it in your report. You can use whatever tools you like to produce useful summaries (e.g. plot graphs of time vs data size).
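For instance, here is a sketch of one way to collect a time-versus-size table for sortA that is easy to plot; the sizes and the file names tmpdata, ttime and timesA.txt are illustrative only, and it relies on GNU time's --output option:

# user time for sortA over a range of input sizes,
# one "size seconds" line per size
for n in 5000 10000 20000 50000 100000; do
    ./gen $n R > tmpdata
    /usr/bin/time --format="$n %U" --output=ttime ./sortA < tmpdata > /dev/null
    cat ttime
done > timesA.txt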
To help with the experiments, we have provided a shell script called runtests to (surprise!) run tests and collect timing data. As supplied, this script tests the built-in sort program, which is not helpful for you, so you'll need to modify it to use one of your sortA or sortB programs. You could use it as follows:
# set up appropriate testing for sortA
edit runtests
# run all specified tests and write results to the log file
sh runtests
# save a copy of the test log for sortA
mv log logA
# set up appropriate testing for sortB
edit runtests
# run all specified tests and write results to the log file
sh runtests
# save a copy of the test log for sortB
mv log logB
You can modify this script as much as you like to fit in with the tests that you devise. You don’t even need to use the script at all if you’re happy to run all of your test cases manually.
Note that some tests will take a l-o-n-g time to run with large data. You can remove the large data sizes from the outer for loop if you can't wait, but you should probably add more smaller sizes to get more data points to try to determine execution cost trends. Unfortunately, students cannot leave jobs running in the background after logging out, so you'll need to stay logged in to a CSE machine while you're running tests.
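As a rough illustration only (the supplied runtests script may well look quite different), the kind of outer loop being described is something like:

# outer loop over data sizes: drop the big sizes, or add smaller
# ones, depending on how long you are prepared to wait
for size in 10000 20000 50000 100000 200000; do
    for order in A D R; do
        ./gen $size $order | /usr/bin/time --format="$size $order %U" ./sortA > /dev/null
    done
done 2> log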
Once you've run the tests, you might want to edit the log files to clean out the irrelevant stuff.
Tips for measuring: the Unix time command works by sampling and will likely produce different results for the same program run multiple times (the runtests script will do this for you). Take an average over a number of timings to account for this. Also, beware of claiming too much accuracy: you can't really claim more than one or two significant digits on an average from the time command.
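For example, one way to repeat a timing and average it (a sketch; it assumes the input file mydata already exists and uses GNU time):

# run sortA five times on the same input and average the user times
for i in 1 2 3 4 5; do
    /usr/bin/time --format="%U" ./sortA < mydata > /dev/null
done 2>&1 | awk '{ sum += $1 } END { printf "average: %.1f seconds\n", sum / NR }'

Reporting the average to one decimal place is in keeping with the accuracy warning above.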
The precise format of your report is up to you, but it must include:
- a summary of the results for each program
- an argument, based on the observed behaviour, for what each program is
If you want to include detailed results, put them in an appendix.
Tips on Writing the Report
Note that these tips come from an assignment that was used in this course several years ago. Since this is a lab and not an assignment, the scale will be smaller.
Bad things
- Not checking your spelling.
- Not proofreading your report to check for other mistakes.
  - e.g. writing "sortA" when you are actually talking about sortB
- Trying to perform meaningful analysis using very small timing data
  - e.g. 0.004s vs 0.005s. Too much noise! Aim for readings in seconds, not milliseconds.
- Focusing on absolute speed, instead of on growth rates (O(n²), O(n log n), etc.)
  - e.g. "this ran fast", "this ran slow" – what does that tell you? how do you account for constants?
- Trying to compare the absolute speed between your sorting algorithms
  - e.g. "sortB was faster than sortA": we could have implemented each sort in different languages or with differing levels of optimisation, so it is meaningless to compare them this way! You may have had (e.g.) a "very slow" mergesort and a "very fast" bubble sort.
- Not being specific enough about how you decided on a particular sort algorithm
- Writing a table full of numbers, and then leaving the units out (seconds, milliseconds, hours… ?)
- Assuming a sort algorithm was stable for every input, just because it was stable on a particular input you gave it
  - It is easy to show that a sort algorithm is unstable: just find an example. It is much harder to prove that a sort algorithm is stable for every possible input that you could give it.
- Being vague when you could be specific
  - e.g. "much less", "slightly better", "increase a lot", "fast"
- Not being specific enough about your testing data
- Not stating assumptions and reasons clearly
- Making incorrect assumptions about the way our sorts were implemented (based on what?)
- Not being skeptical enough
  - e.g. "obviously", "clearly", …
- Putting filler in your report
  - e.g. putting filler in your report
  - e.g. putting filler in your report
- Not understanding how the sorting algorithms work
- Not taking report presentation seriously enough
Good things
- Trying to check whether an algorithm was unstable
- Clearly stating your expectations, testing a hypothesis
  - e.g. "We expect …"
- Being skeptical
  - e.g. checking whether the sorts were actually sorting all your input, or dropping some?
- Trying to overcome timing accuracy errors
  - e.g. repeating tests and taking an average, using large inputs that gave times in seconds rather than milliseconds
- Giving reasons for claims you make
  - "insertion sort is stable because…"
Submitting
You must get the lab marked by your tutor in your lab.
Submit your work with the give command, like so:
give cs2521 lab06
Or, if you are working from home, upload the relevant file(s) to the lab06 activity on WebCMS3 or on Give Online.