Course Code | COMP3311 |
Course Title | Database Systems |
Convenor | John Shepherd |
Admin | Dylan Brotherston |
Classes |
Timetable for all classes
The lectures will be live, but not live-streamed. They will be recorded and available on YouTube and Echo360 after the lecture. |
Consultations | TBA |
Units of Credit | 6 |
Course Website | https://webcms3.cse.unsw.edu.au/COMP3311/23T1/ or http://cgi.cse.unsw.edu.au/~cs3311/23T1/ |
Handbook Entry | http://www.handbook.unsw.edu.au/undergraduate/courses/current/COMP3311.html |
Issues | email cs3311@cse.unsw.edu.au or stureps@cse.unsw.edu.au |
If you encounter any problems with the course or the way it's being run, contact the class account and we will try to fix the problem. If you'd rather not contact us directly, then contact the Student Reps .
This course aims to explore in depth the practice of developing database applications and the theory behind relational database systems. It will also give a very brief overview of the technologies used in implementing database management systems and the past, present and future of database systems.
Large data resources are critical to the functioning of just about every significant modern computer application. Hence, knowledge of how to manage them is clearly important to the IT industry. In the context of further study, COMP3311 also provides a foundation for further study in advanced database topics, such as COMP9312 Graph Data Analytics, and COMP9315 Database Systems Implementation. Database concepts are also relevant in courses such as COMP9319 Web Data Compression and Search and COMP6714 Information Retrieval and Web Search.
By the end of this course, we want you to be capable of building high-quality (correct and efficient) applications based on relational databases, to have a sound understanding of issues in managing relational database management systems, and an overview of how they work internally.
The official pre-requisite for this course is that students have taken either COMP2521 or COMP1927 .
Whatever the formal pre-reqs, we assume primarily that students have some experience with procedural programming and some knowledge of data structures.
Some assignment work will be done in Python. Database interaction requires a relatively simple background in Python. If you have no Python experience, you ought to look at one of the hundreds of online Python tutorials before the end of Week 6.
A perpetual problem for COMP3311 is that around half of the class has already covered basic data modelling techniques (specifically ER diagrams) and/or basic SQL in courses such as INFS1603 Introduction to Business Databases or COMP1531 Software Engineering Fundamentals. On the other hand, half of the class hasn't seen this material before, so we need to cover it. Those who have seen it before should treat this as revision. Don't make the mistake of thinking "I know all this stuff"; we will definitely cover these areas in more depth than you have seen them previously.
On successfully completing this course, students should be able to:
Glossary:
This course contributes to the development of the following graduate capabilities:
Graduate Capability | Acquired in |
Scholars capable of independent and collaborative enquiry, rigorous in their analysis, critique and reflection, and able to innovate by applying their knowledge and skills to the solution of novel as well as routine problems | assignments, prac work |
Entrepreneurial leaders capable of initiating and embracing innovation and change, as well as engaging and enabling others to contribute to change | - |
Professionals capable of ethical, self- directed practice and independent lifelong learning | assignments, prac work |
Global citizens who are culturally adept and capable of respecting diversity and acting in a socially just and responsible way | - |
Summary of teaching strategies in COMP3311 22T3:Videos + Slides ... introduce concepts
Most course content will be made available via pre-recorded videos and accompanying slides. Each video will be 20-40 minutes in duration and will cover a single topic. Videos will be available at the start of each week. Also at the start of each week, there will be a Weekly Preview video outlining what topics will be covered during that week, and what you should be doing during the week.
In "lectures", we will not be doing the traditional exposition of content. Rather, we assume that you have seen the content by viewing the videos, and will do examples to clarify the concepts. Lectures are intended to be interactive, so please ask questions or pose alternative solutions. Lectures will be recorded, but not live-streamed.
Each week, you will have a 90-minute session with a tutor; most of these sessions are face-to-face, some are online. There will be a set of tutorial problems available each week, covering topics from the previous week. During the tute, you can either run a Q&A with the tutor or work through some of the problems together. You are not expected to simply work through answers to all problems; there are generally too many problems to cover in 90 minutes anyway. Discussion is encouraged. Solutions will be made available to the tutorial problems in the weeks following the tute.
Prac exercises will be made available periodically on the course web site. You should work through each exercise when it becomes available, since exercises are often a pre-cursor to the assignment work that follows.
In the assignments, you will consider some of the major stages of the database application development process: implementation of a PostgreSQL application, and building components of a database application using Python. The assignments contribute 35% of the overall mark for this course.
Assignments are completed individually ; this means that you should do them yourself without assistance from others, except by asking advice from the Lecturer or Course Admin or your Tutor. Assignments are a primary vehicle for learning the material in this course. If you don't do them, or simply copy and submit someone else's work, you have wasted a valuable learning opportunity. The assignments also lay the foundation for successful performance in the final exam.
Assignments are to be submitted on-line (via give or WebCMS) before midnight on the due date. Assessment of assignments will be based on how accurately they satisfy the requirements, as determined by automatic marking, and by whether they meet efficiency requirements. We provide testing harnesses for the assignments so that you can determine whether your code is producing the correct output. The supplied tests will use one instance of the database; the auto-marking will use these same tests, but will also run tests using one or more different database instances.
Late submissions will have marks deducted from the maximum achievable mark at the rate of 0.21% of the total mark per hour that they are late (i.e., approx 5% marks off the maximum assignment mark per day). No submissions will be accepted after 5 days (i.e. a mark of zero).
Starting in Week 2, there will be quizzes on topics from previous weeks. This gives you a chance to review what you've learned on those topics. Quizzes are released on Monday mornings and are due before midnight on the following Friday (i.e. 5 days later). Each quiz will have 4 questions.
Do not submit your quiz answers late (there are heavy late penalties), or forget to submit a quiz. If you miss a quiz because of illness, you should apply for special consideration.
You should make use of the forums to help you with prac work and assignments. If you want your question answered, you need to provide details (e.g. what you tried, what you observed, the context where you tried it). It is very difficult to provide assistance without enough information on what you're doing. In particular, problems in setting up and running your PostgreSQL server are very difficult to diagnose with a lot of context. Such questions are probably best handled in a consultation, where you can share/show your screen.
One thing you should not post on the forum are examples of assignment work (e.g. "Tried this query for Q3, but it gave me this unexpected output"). Save such questions for consultations.
The Final Exam will be held in the CSE labs.
using CSE's exam environment which does not provide general internet access. The paper will be made available via a locked-down web browser and submission will be via Give. The paper itself will be primarily based on SQL and PLpgSQL.
The exam will be 3 hours in duration, and should, if you know the content well, take less than 3-hours to complete.
We will make some provision for students who are not in Australia and cannot attend an exam at UNSW, to take the exam offshore. The details of this are still being worked out.
Note: anyone caught colluding, collaborating, posting exam solutions onto any internet platform, reading solutions from any internet platform, will be reported to the Student Conduct Investigation Unit, and will most likely end up with a mark of 00FL for the course, if not exclusion from further study at UNSW.
COVID has thrown up some challenges in running CSE courses. We have adopted the above strategies because we believe they are the best way for helping you to learn in an online environment.
The Student Code of Conduct ( Information , Policy ) sets out what the University expects from students as members of the UNSW community. As well as the learning, teaching and research environment, the University aims to provide an environment that enables students to achieve their full potential and to provide an experience consistent with the University's values and guiding principles. A condition of enrolment is that students inform themselves of the University's rules and policies affecting them, and conduct themselves accordingly.
In particular, students have the responsibility to observe standards of equity and respect in dealing with every member of the University community. This applies to all activities on UNSW premises and all external activities related to study and research. This includes behaviour in person as well as behaviour on social media, for example Facebook groups set up for the purpose of discussing UNSW courses or course work. Behaviour that is considered in breach of the Student Code Policy as discriminatory, sexually inappropriate, bullying, harassing, invading another's privacy or causing any person to fear for their personal safety is serious misconduct and can lead to severe penalties, including suspension or exclusion from UNSW.
If you have any concerns, you may raise them with your lecturer, or approach the School Ethics Officer , Grievance Officer , or one of the student representatives.
Plagiarism is defined as using the words or ideas of others and presenting them as your own. UNSW and CSE treat plagiarism as academic misconduct, which means that it carries penalties as severe as being excluded from further study at UNSW. There are several on-line sources to help you understand what plagiarism is and how it is dealt with at UNSW:
Make sure that you read and understand these. Ignorance is not accepted as an excuse for plagiarism. In particular, you are also responsible that your assignment files are not accessible by anyone but you, by setting the correct permissions in your CSE directory and code repository, if using. Note also that plagiarism includes paying or asking another person to do a piece of work for you and then submitting it as your own work. Plagiarism also covers collusion: working on an individual assessment with other students. If the assignment is individual, do it yourself , or ask your tutor or the forum for help.
UNSW has an ongoing commitment to fostering a culture of learning informed by academic integrity. All UNSW staff and students have a responsibility to adhere to this principle of academic integrity. Plagiarism undermines academic integrity and is not tolerated at UNSW. Plagiarism at UNSW is defined as using the words or ideas of others and passing them off as your own. This encompasses both copying work from your fellow students (plagiarism) or asking/paying someone to do the work for you (contract cheating).
[What follows is courtesy of Stuart Prescott in Chem Eng ...]
Artificial intelligence tools such as ChatGPT, CodePilot, CoPilot, and built-in functions within Word are modern tools that are useful in some circumstances, but reliance on them is not a path to success at university or in your later career. Reaching for a calculator to add up 1+1 is possible but not professionally sustainable for an engineer or scientist (or one might say in our educated society at all!), and that is why you learnt to do that yourself at primary school. Likewise, in your degree at UNSW, we're teaching you skills that are needed for your professional life, which is a combination of some things that AI could feasibly do for you and lots of things that the AI tools cannot do for you — if we were only teaching you things that AI could do, your degree would be worthless and you wouldn't have a job in 5 years. You can therefore see that from an academic standards perspective, the output from an AI tool will be below the minimum standards expected for a course, even if you were to submit it (which you should not!). Your ability to complete later assessments where AI cannot help you will also be compromised if you've relied upon AI earlier.
It is also worth remembering what these AI tools such as ChatGPT are: they are only statistical models about how groups of words frequently appear. These AI based tools are not smart, they don't know anything other than how words are often grouped, and they most certainly do not understand any of the content from any of your courses. Some consequences of their word-statistics and non-scientific basis are:
In summary, the AI tools generate text output that is superficially reasonable, very confident sounding, and very often wrong. We are setting an expectation that our graduates should out-perform AI, meaning that it is a tool of limited academic use in your degree.
[Thanks Stuart]
If you haven't done so yet, please take the time to read the full text of
The pages below describe the policies and procedures in more detail:
Read the previous section, if you haven't already read it.
There will be three places where your learning in this course will be assessed: assignments, quizzes and the final exam.
Much as we dislike conflating the learning aspect of assignments with their assessment aspect, there will be marks for the assignment work. We would rather use the assignments entirely as learning vehicles and have no assessment associated with them, but I suspect that wouldn't result in a satisfactory outcome (i.e., nobody would do them).
There will also be a number of on-line quizzes during the semester, to be taken in your own time (but before indicated due dates), primarily as a way for you to gauge your progress in the course, but also worth some marks.
Lastly, the final exam will form the major assessment in this course and aims to test what you learned about databases during the course of the semester. The Final Exam will be held in person in the CSE labs, and will involve both a practical component (writing SQL queries on a supplied database) and a theory component (e.g., data modelling, schema normalization, etc.). In order to meet the exam hurdle requirement, you must score at least 40% (20/50) on the exam. Note that partial marks are available for all questions on the final exam, and the hurdle will be enforced after any required scaling.
The exam will contain questions that your should be able to comfortably answer in 3 hours if you know the material well. However, you will have a 4-hour window in which to complete the exam.
Item | Topics | Due | Marks | Contributes to |
Quizzes | All topics | Weeks 2,3,4,7,8,10 | 12% | 1,2,3,4,5,6,7,8 |
Assignment 1 | SQL/PLpgSQL | Week 5 | 12% | 3,4 |
Assignment 2 | Python/SQL | Week 9 | 16% | 5 |
Final Exam | All topics | Exam period | 60% | 1,2,3,4,5,6,7,8 |
Your final mark in this course will be based on the marks from the above three assessment components. Note that the exam is a hurdle, so that if you fail the exam badly enough, you cannot pass the course. The following formula describes precisely how the final mark will be computed and how the hurdle will be enforced:
quizzes = mark for on-line quizzes (out of 12) ass1 = mark for assignment 1 (out of 12) ass2 = mark for assignment 2 (out of 16) exam = mark for final exam (out of 60) okExam = exam >= 24 (after scaling) mark = ass1 + ass2 + quizzes + exam grade = HD|DN|CR|PS if mark >= 50 && okExam = FL if mark < 50 && okExam = UF if !okExam
Week | Lectures | Tutes | Prac Work | Assignments | Quizzes | Notes |
1 | Course intro, Data Modelling, ER diagrams | - |
SQLite3,
Set up PostgreSQL server |
- | - | A long note xyzzy |
2 | Relational Model, SQL DDL, Mapping ER to SQL | Data modelling, ER | Defining a database | - | Quiz 1 | - |
3 | SQL queries | ER -> SQL | SQL queries | Ass 1 released (mon) | Quiz 2 | - |
4 | PLpgSQL functions | SQL |
PLpgSQL functions
|
- | Quiz 3 | - |
5 | Triggers, Aggregates | PLpgSQL functions | Aggregates | Ass 1 due (fri) | - | - |
6 | Flexibility Week | - | - | - | - | - |
7 | DB/PL interaction, Python, Psycopg2 | Triggers | Python meets SQL | Ass 2 released (mon) | Quiz 4 | - |
8 | Functional dependencies, Normalization | Psycopg2 | - | - | Quiz 5 | - |
9 | Relational Algebra, Query execution, Tuning | Normalization | - | Ass 2 due (fri) | - | - |
10 | Transactions, Concurrency control, Database futures, Course review | Relational algebra | Sample exam |
-
|
Quiz 6 | - |
Any of the following books is a suitable textbook for this course:
Choose the one that best suits your learning style and preferences. Also, if you have access to an earlier edition of any of these books (one or two editions less than the one given above), it will be fine for this course.
The textbooks give the greatest detail on the topics covered in the course, but cover many other topics. The Course Notes have less detail, but cover exactly the course syllabus. The Lecture Slides have even less detail, but include the examples discussed in lectures.
The software systems to be used in this course are PostgreSQL, SQLite, Python and psycopg2. The documentation and manuals provided with PostgreSQL and Python are actually very good, and the SQLite and psycopg2 documentation is also reasonable, so you don't need to buy textbooks for these. However, if you feel more comfortable with a book, there are references to a range of books on the web sites for PostgreSQL , SQLite , and Python .
A general problem with technology textbooks is that they go out-of-date very quickly. Another problem is that many of them provide a brief introduction with some examples, and then give a summary of the system documentation. In general, we have found O'Reilly books tend to be better than most.
PostgreSQL will be used as the primary DBMS in this course because it is a typical example of a full-featured client-server DBMS, and has the added bonuses that (a) it has a simple extensibility model and (b) has the source code available if you want to learn more about how DBMSs work. Commercial alternatives could have been Oracle, DB2, SQL Server, although none of these are available in source code form. (MS Access is not a full-featured relational database.) The only plausible open-source alternative is MySQL, but it was not a full-featured DBMS until version 5, and has a source code base that is largely cobbled together from a number of existing open-source systems. The PostgreSQL code base, on the other hand, is the result of coherent development by a relatively small team (although the developer base has expanded in recent years).
SQLite is a very widely-used example of a serverless RDBMS and will be used for contrast with PostgreSQL. It has a similar (90% compatible) dialect of SQL to PostgreSQL, and, like PostgreSQL, supports the SQL92 standard. As an exercise in portability, we will be expecting whatever SQL statements you write for this course to work on both PostgreSQL and SQLite (although this may not always be possible).
For programming interaction with the database, we will be using Python via psycopg2.
This course is evaluated each session using the myExperience system.
In the previous offering of this courses, students noted that the Webcms3 forums were less than optimal and that it was difficult to get engaged (with other students) online.
We have re-written the Webcms3 forums (coming soon) to incorporate the requested features from other forum platforms like Piazza.
Tute classes are 1.5 hours this term, to give more chance to interact with tutors. Tutes will also be more focused on practical aspects, after students requested this in the 21T3 myExperience survey.
Students asked for lab classes as well as tutorials. This is not feasible with our current budget, but there are Prac Exercises which are essentially the same kind of thing as would be done in labs. Since the forum is njot a great mechnism for getting help on these, we will run Help sessions each week, where you can bring your practical problems and get them solved.
Finally, to try build a sense of community, have created an "Amusing Musings" page, where you can post any old (preferably database-related) ideas that you think others might find interesting or amusing.
Resource created Saturday 10 December 2022, 06:11:32 PM, last modified Tuesday 20 June 2023, 04:22:08 PM.