Hi all,
We have scheduled an online consultation session to help you prepare for the final exam. Details below:
Instructor: Mingqin Yu
Date: Tuesday 28th April 2026
Time: 4:00pm-6:00pm
Join: https://teams.microsoft.com/meet/43783546415754?p=6n1RjlXYkyEwKFJRQO
Meeting ID: 437 835 464 157 54
Passcode: K4A7BJ6i
The consultation session is completely optional to attend and may not be recorded. You can join at any time during the session, but are encouraged to make use of the session if you need any clarifications/help with the course content.
Good luck! :)
Regards,
Aditya
Hello all,
Two announcements regarding the final exam for COMP6713:
(A) Final Exam Instructions:
Make sure you are exam-ready!
* For Windows: Your device can only have one correct SEB version installed. You will need to uninstall the incorrect SEB version prior to the exam.
During the Term 1 2026 final exam session, an onsite IT Helpdesk will be available to support students from Friday 1 May to Thursday 14 May. Find it in the John Niland Scientia Building Foyer, G19 on the Campus Map. No booking required!
Operating Hours:
(B) Sample Inspera Exam
A sample exam to familiarise yourself with Inspera and the general format of the final exam can be accessed as follows:
URL: https://unsw.inspera.com
Test code: NSW-481768826
Test password: nk4026
The sample exam will be available from 26th April 2026 10:00 AM to 30th April 2026 12:00 PM. The sample exam is completely optional and does not contribute to any marks in the course. The course team will be unable to provide feedback on your answers; this applies particularly to questions with subjective answers.
==
As always, please use the WebCMS forum to ask questions, if any, or email the course team at cs6713@cse.unsw.edu.au.
Regards,
Aditya
Dear all,
This is a reminder that the COMP6713 Group Project is due tomorrow, Friday, 24 April 2026 at 5:00 PM (Sydney time).
Please make sure you submit via Moodle:
https://moodle.telt.unsw.edu.au/mod/assign/view.php?id=8450977
Please also read the submission guide carefully before preparing your final submission:
https://webcms3.cse.unsw.edu.au/COMP6713/26T1/resources/120811
Quick reminders from Aditya's previous announcement:
If you have any questions, please use the WebCMS forum or email cs6713@cse.unsw.edu.au. When emailing, please Cc all your team members to avoid parallel communication.
Important: If your team does not respond to your assessor's email, you will receive zero for the presentation component of the group project.
We hope you have enjoyed the course. All the best for your upcoming exams.
Regards,
Dipankar
Dear all,
Gentle reminder that we have a guest talk by Dr. Necva Bolucu (CSIRO) tomorrow followed by a course recap and discussion of the final exam. Necva is an inspiring researcher ( https://people.csiro.au/b/n/necva-bolucu ) and her work embodies a fantastic application of NLP for science, specifically in the government research context.
Also, the myExperience response rate has surpassed 60%. Therefore, as promised, I will discuss some hints with respect to the final exam in tomorrow's class. You are encouraged to attend in person to make the most of the guest talk and the subsequent discussion.
If you have not completed the myExperience survey yet, there is still time for you to do so. Your comments are valuable for the future editions of the course and help to ensure that the course continues to run in the future. The course team will greatly appreciate your feedback, and, if applicable, the good things you have to say about the course.
I look forward to seeing you all in the last class.
Regards,
Aditya
Hello all,
This announcement contains details of the guest talks in week 10 and the assignment marking methodology.
(A) GUEST TALKS
In week 10, our lectures will be organised as follows:
Monday, 20th April: Week 10 content (+Guest lecture by Dr. Manuel Weiss, JobAdder)
Wednesday, 22nd April: Course recap, final exam details and discussion (+Guest lecture by Dr. Necva Bolucu, CSIRO)
You are strongly encouraged to attend in person if you would like to engage with the speakers live or ask questions about the final exam. Please note that the content of the guest lectures will not be included in the final exam syllabus. However, the week 10 content and the course recap/final exam discussion will be included in the syllabus for the final exam.
Talk 1: The Model is Only 5%: Engineering Resilient AI Systems for the Real World by Dr. Manuel Weiss (JobAdder)
This talk will discuss the practical challenges of deploying AI systems in real-world settings, with a focus on issues beyond the model itself. Topics will include latency, cost considerations, security, maintenance, evaluation, and other engineering factors that are essential for building reliable AI systems in practice.
Dr. Manuel Weiss completed a PhD in Bioinformatics at the University of Zurich, and has worked as a software engineer across several companies. He spent around seven years at SEEK, where he led the recommendations team and later also the search team. He is currently Director of Engineering at JobAdder, a recruitment software company.
----
Talk 2: Tackling Noisy Annotations in Scientific Information Extraction by Dr. Necva Bolucu (CSIRO)
In real-world annotation pipelines, perfectly labelled datasets are rare. Label noise can arise from disagreements among crowdsourced annotators or inconsistencies introduced by domain experts working under ambiguous guidelines. Regardless of its source, this noise poses a serious challenge to model reliability. This issue is especially important in scientific text, where domain complexity is high and expert-quality annotation is expensive. Existing solutions remain limited, often either discarding noisy samples and losing useful signal, or treating all annotations uniformly. This talk presents an adaptive approach that distinguishes clean samples from noisy ones and leverages both through weighted weakly supervised learning, showing consistent improvements across scientific IE benchmarks.
Dr. Necva Bolucu is currently a Research Scientist at CSIRO. Her research focuses on natural language processing, deep learning, and the scientific domain. Her work includes information extraction, domain adaptation, knowledge extraction, and downstream NLP applications in scientific text.
---
(B) ASSIGNMENT MARKING METHODOLOGY
The manual grading was conducted by the course team using the detailed marking rubric shared with the class. Graders had the freedom to err on the side of caution when deemed appropriate, and have also added their notes wherever applicable. The automatic grading was conducted using 30 curated test cases, and the marking criteria were implemented using appropriate NLP techniques (and you have learned all of them in this course! :) ). The mark for each criterion is the average across the test cases. The autograder ran successfully and produced non-zero marks for around 98% of submissions.
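To illustrate the averaging step only, here is a minimal sketch. The criterion names and scores are made up for the example, and this is not the actual autograder code.

```python
from statistics import mean

# Hypothetical per-test-case scores (each between 0 and 1) for two made-up criteria.
# The real autograder used 30 test cases; three are shown here for brevity.
test_case_scores = [
    {"fluency": 1.0, "meaning": 0.5},
    {"fluency": 0.5, "meaning": 0.5},
    {"fluency": 0.0, "meaning": 1.0},
]


def average_per_criterion(scores):
    """Return the mean score of each criterion across all test cases."""
    criteria = scores[0].keys()
    return {c: mean(case[c] for case in scores) for c in criteria}


criterion_marks = average_per_criterion(test_case_scores)
print(criterion_marks)
```

A single failing test case therefore lowers the mark for a criterion without reducing it to zero.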
We have received assignment review requests from ~15 students, and plan to address them in late week 10/week 11. Any updates will be reflected in Moodle once the review is complete.
If you have reported a concern regarding receiving zero marks for auto-grading, we will also attempt to make minor changes to fix your code. However, as can be expected, you will receive only partial marks if the tests then pass.
Please reach out to the cs6713 mailing list <cs6713@cse.unsw.edu.au> in case of further issues.
--
I thank the tutors (and the course admin, of course!) for their meticulous effort marking the assignments.
I hope to see all of you in class in week 10! I wish you all the best for the group project and the final exam.
Regards,
Aditya
Dear all,
The assignment marks are now available on Moodle. Please take a moment to check your mark and read any comments provided.
The average mark for the assignment is 17.9/25 and the highest is 21.8/25!
Best wishes for your project submission :)
Regards,
Dipankar
Hi all,
Hope you have enjoyed the course so far.
We are nearing the end of the term, and it is time to reflect on your experience in COMP6713. On behalf of the course team, I invite you to complete the myExperience survey. You can find it by following the link in the email you would have received or by visiting https://go.blueja.io/LZV7zUAGGUqrJQDkzCUAYw
myExperience responses are anonymous, and allow us to know how well we did and to get suggestions to make the course better. All feedback is extremely valuable even if your feedback is "Everything was ok, don't change a thing". :)
The response rate is 3% right now. I will share additional details (more than what I would otherwise) regarding the final exam in the last lecture if the response rate has reached 60% by then.
Regards,
Aditya
Dear all,
Hope you are enjoying COMP6713 so far! :)
The submission and marking guide for the group project is now available here: https://webcms3.cse.unsw.edu.au/COMP6713/26T1/resources/120811
Please read the above guide carefully before preparing your final submission.
Quick reminders:
If you have any questions, please use the WebCMS forum or email cs6713@cse.unsw.edu.au (please remember to Cc all your team members to avoid parallel communication).
All the best, and see you all in class next week!
Regards,
Aditya, Dipankar
[Minor edit to the notice. The next lecture is not on 2nd April but 8th April. Apologies for the error.]
Hi all,
To make up for the missed lecture next Monday (due to the public holiday), I have recorded the first part of the Machine Translation module: https://echo360.net.au/lesson/99167dc4-09b4-4fc4-ba5c-6c2f2c229651/classroom
The recording can also be accessed via the course's Echo360 page: the one where you usually watch lecture recordings.
Our next in-person lecture will be on Wednesday, 8th April at 9:00 am.
Wish you all a good Easter break!
Regards,
Aditya
Hi all,
Hope the flexi week went well for you. :)
Please note that the guest talks (including the one today) will only be a part of the 2-hour lecture. This means that I will start with a module on POS tagging & NER during one half of today's lecture and continue with the module on Wednesday. The module is a part of the syllabus as mentioned in the course outline.
The above holds for all lectures with guest talks hereafter: only the guest talk itself is excluded from the syllabus for the course.
Please use the class forum to post questions, if any.
Regards,
Aditya
Hello all,
We will have our first guest talk this term, delivered by Dr. Raj Dabre from Google Sydney. The talk will be delivered in person on Monday, 30 March 2026, and you are strongly encouraged to attend in person if you wish to engage with the speaker. Please note that the content of this talk will not be included in the final exam syllabus.
Details of the talk are below:
-------
Title: Multilingual Expansion of Large Language Models
Abstract: Large language models do not necessarily support all languages of interest, and retraining them from scratch is a costly process. In this talk, we will cover some simple yet impactful approaches to expanding and improving language coverage of LLMs. The focus will be on synthetic data creation through Romanization and Translation, as well as strongly convex vocabulary expansion. This talk aims to serve as an entry point and encourage further exploration into low-resource languages and multilingual LLMs.
About Dr. Dabre: Dr. Raj Dabre is a Research Scientist at Google and a Visiting Faculty at IIT Madras. His research interests include low-resource languages and large language models.
-------
Note from Aditya: Raj is an old friend - we did our Masters together, and he is known to be a fantastic orator. So, I strongly recommend that you attend the talk in person, if you are able to.
Regards,
COMP6713 Teaching Team
Dear all,
[If you have registered a project team already, this notice does not apply to you.]
This is a gentle reminder that the form to indicate that you are looking for a project team (please refer to the webcms notice posted by the course admin for the link to the form) is due by 18 March 2026, 5:00pm (Sydney time). At the time of sending this email, I have assigned project groups to all students who have filled the form.
After the deadline: We will not be assigning any group project teams, and the project scope form will also not accept any responses. This means that you will not be able to complete the group project and will consequently (and unfortunately) receive zero marks for the group project component of the course.
Please reach out to cs6713@cse.unsw.edu.au if you have any questions.
Regards,
Aditya
Dear students,
If you have not yet joined a group for the COMP6713 project, please complete the form linked below so that we can assign you to a project group.
Form link: https://forms.office.com/r/EkfUcbkxBR
Deadline to complete the form: 18 March 2026, 5:00pm (Sydney time)
After the deadline, the course team will allocate students into groups based on the responses received.
Once groups have been assigned, each group will have 5 days to submit their project description through the project registration form. Groups may also choose to upload their scope document at that time.
Please note that groups that do not submit their project description within this 5-day window will not be able to proceed with the group project, as the course team needs this information to plan project assessment and allocate assessors.
If you already have a registered group, you do not need to complete this form.
If you have any questions, please contact cs6713@cse.unsw.edu.au.
Regards,
Dipankar Srirag
Aditya Joshi
Dear Students,
To help with your computing requirements for the group projects in COMP6713, we are providing a Google Cloud Coupon worth USD 50, made available as Google Cloud Education Credits. Below is the URL you will need to access in order to request a Google Cloud coupon. You will be asked to provide your school email address and name. An email will be sent to you to confirm these details before a coupon is sent to you.
Please contact me if you have any questions or issues.
Please note that the course team is unable to provide technical support to set up or run your Google Cloud account.
Regards,
Aditya
Hello everyone,
Hope you are enjoying the course so far and are excited to apply what you have been learning to the group project.
If you have not already done so, this is a good time to finalise your project groups and register them using the project registration form: https://forms.office.com/r/c5R5BddBZF
As a reminder:
The group registration deadline is: 13 March 2026, 11:59 pm (Sydney time).
Along with registering your group, you may also upload your completed scope document through the same form. Uploading the scope document at this stage is optional, but we strongly encourage it. If you submit it, the course team will have the opportunity to:
This can help ensure that the project is appropriately scoped for a 3-week effort by a team of 5 students. The scope document template is available on WebCMS. Only one person per team should submit the form and upload the scope document on behalf of the group.
If you have any questions, please contact cs6713@cse.unsw.edu.au.
All the best!
~ Dipankar
Assignment: Response to questions (2/n)
Hello everyone,
Thank you for the questions and discussion on the forum. I would like to clarify one point from the previous announcement regarding the execution environment during automatic evaluation.
You may assume that internet access will be available during grading. This means that if your implementation requires downloading permitted resources at runtime (for example, pretrained embeddings available through the gensim downloader), your code may do so.
However, the following constraints from the assignment specification still apply:
As a reminder, your submission must be self-contained and include only:
packaged together inside zid.zip.
If your code requires downloading permitted resources, please ensure that this is handled automatically within your program and does not require any manual setup.
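One way to handle such downloads automatically is sketched below, assuming the gensim downloader. The function name `get_embeddings`, the cache, and the default model name are illustrative choices, not requirements of the assignment.

```python
# In-process cache so the resource is loaded at most once per run.
_CACHE = {}


def _gensim_loader(name):
    # Imported lazily so gensim is only needed when this loader is actually used.
    import gensim.downloader as api

    # api.load downloads the resource on first use and reuses the local copy afterwards.
    return api.load(name)


def get_embeddings(name="glove-wiki-gigaword-50", loader=_gensim_loader):
    """Load a permitted resource on demand, with no manual setup step."""
    if name not in _CACHE:
        _CACHE[name] = loader(name)
    return _CACHE[name]
```

Calling `get_embeddings()` from within your program fetches the vectors the first time they are needed, so nobody running the code has to download anything by hand.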
Regards,
Dipankar
Hello everyone,
It is great to see that you have started working on the assignment, and that many of you have similar questions, particularly around what counts as training data and what external resources are allowed.
For this assignment, the dataset we provided (data.csv) should be used to build any models, statistics, or heuristics that help you identify which words in a sentence should be replaced. You are free to use any reasonable method for this part, but please do not use large pretrained transformer models such as BERT, GPT, etc.
Regarding pretrained embeddings and lexical resources, you are allowed to use tools such as GloVe, gensim embeddings, spaCy (including models like en_core_web_sm), NLTK and WordNet. These can be used to help identify suitable replacement words or compute semantic similarity. However, they should not be used as additional training data for your model.
Please also make sure that your submission is self-sufficient and runs without any manual changes to it when running automatic tests. Your submission should contain only:
packaged together in zid.zip.
Finally, we have uploaded a small set of unit tests on Moodle. The unit tests are provided only for your own qualitative testing, and you are allowed to add your own examples to the file. You can use them to check whether your approach produces reasonable outputs. The automatic marking tests and the corresponding setup are not being shared.
Hope this helps clarify things. If you still have questions, please feel free to post on the forum. All the best!
Regards,
Dipankar
Dear all,
The individual assignment (description, data, etc.) is now available on Moodle. The assignment is due by Friday, 20 March 2026, 5:00 PM.
Please upload your submission to the appropriate place in Moodle, and do not email your files to the course team.
Good luck!
Regards,
Aditya
Hi all,
I will be running a consultation every Tuesday from 13:30 to 14:30 starting this week until week 10.
Please utilise this time to seek clarifications about either the technical content or admin details of COMP6713.
In-person: 217B in K17 building UNSW Kensington
Online: Please email cs6713@cse.unsw.edu.au close to the time to schedule an online call.
Regards,
Aditya
Dear all,
Welcome to COMP6713: Natural Language Processing in 2026 Term 1!
Over the next ten weeks, we will explore and understand how computers process human (natural) language. The COMP6713 team is led by lecturer-in-charge: Aditya Joshi, with the help of course administrator Dipankar Srirag (myself), and tutors: Austin, Amrita, Freya, Liangji, Martin, Mingqin, Rahul.
Some quick reminders:
Our first lecture is on Monday, 16th February, 2026, at 4:00pm in Mathews Theatre B. See you there!
Regards,
Dipankar Srirag
Aditya Joshi (Aditya/Adi)