Course syllabus
Course-PM
DAT565 / DIT407 Introduction to data science and AI lp4 VT25 (7.5 hp)
The course is offered by the Department of Computer Science and Engineering
Contact details
Lecturer/Responsible for TAs and assignments: | Dr. Oana Geman | <geman@chalmers.se> |
Lecturer/Examiner: | Dr. Moa Johansson | <moa.johansson@chalmers.se> |
Lecturer: | Dr. Stefano Sarao Mannelli | <s.saraomannelli@chalmers.se> |
Lecturer: | Dr. Mohammad Kakooei | <kakooei@chalmers.se> |
Teaching Assistants (labs and marking of assignments):
Dr. Philip John Harrison | |
Filip Kronström | filipkro@chalmers.se |
Pablo Martinez Crespo | pabloma@chalmers.se |
Mathis Rost | mathisr@chalmers.se |
John Klint | gusjohn25@student.gu.se |
Georg Kyhn | kyhngeorg@gmail.com |
Madhumitha Venkatesan | madven@chalmers.se |
Hang Zou | hangzo@chalmers.se |
Course Representatives:
TBA
Course purpose
The course gives a broad introduction to various techniques and theories used in Data Science and AI, with particular focus on their practical applications.
Schedule
The course includes 2 in-person lectures and 2 laboratory sessions each week. The schedule and location for lectures and sessions vary from week to week due to holidays and other reasons beyond our control. For the latest information on times and locations for the week, please see TimeEdit.
For an overview of the schedule for lectures, labs and responsible Teaching Assistants, please see: Lectures, Labs and Assignments Schedule
Course literature
Skiena, Steven S. (2017). The Data Science Design Manual. Springer.
Available through Chalmers Network and Library at https://link.springer.com/book/10.1007/978-3-319-55444-0
The e-book is also available for purchase from Chalmers Store: https://www.chalmersstore.se/e-bocker/e-bok-the-data-science-design-manual.html
Additional literature can be found in their respective modules.
Course design
The course has eight mandatory assignments; the important dates are given in Canvas. It is your responsibility to know when assignments are released and when they are due.
Assignment deadlines are hard and no extensions are given. The only exception is for illness or similar, in which case you must contact the examiner and relevant head TA for that assignment via email immediately to seek approval. No extensions are given for holiday trips. You will receive feedback for the assignments one week after the deadline, based on the version that was submitted by the deadline.
Programming assignments are done in pairs. Please use the Discussion board to find a partner if you don't have one. If there are any problems with your lab partner (they don't contribute, leave the course, etc.) you must contact the examiner immediately. These things can usually be resolved, but we want to be sure that both partners have contributed to any submitted assignments.
Most assignments are submitted as Jupyter notebooks on CodeGrade. Some parts of each assignment are graded automatically, which means you will get immediate feedback on those parts. Some assignments may instead be multiple-choice quizzes on Canvas. Programming assignments are done in pairs, and you need to select your group within CodeGrade.
Quizzes are done individually. Note that Canvas logs the exact timestamps for when you reply to each question. If two people reply identically and close in time, this can be an indication of unauthorized collaboration on an individual assignment, and may be subject to disciplinary actions.
In order to pass the course, you must pass all assignments and quizzes. An assignment is considered a pass if you have obtained at least 70% of the maximum score.
Lectures are to be considered mandatory, although no attendance will be taken. Lab sessions are not strictly mandatory; they consist of independent work on the assignments, with course staff available to help. Lab sessions are not a replacement for lectures: you cannot expect the TAs to repeat or summarize lecture content. The TAs are there specifically to help with the assignments, not to provide the assumed background knowledge from lectures and readings.
There is reading attached for each lecture. You are supposed to read that before the lecture. The lectures are structured under the assumption that you have read the text in the course textbook, and the content of the textbook will not be repeated. Instead, the lectures complement the textbook and offer further examples, proofs, details etc.
Policy on generative AI
The use of text-generating tools for generating assignment reports or answers to quizzes or similar is forbidden. You are supposed to write your reports yourself.
Furthermore, you are discouraged from using tools such as ChatGPT for "searching for content". Generative AI is very prone to hallucinating and producing convincing, yet completely wrong answers. Do not rely on such tools. You cannot know whether the answer is correct or not, unless you already know the answer. We will discuss this problem towards the end of the course.
Plagiarism policy
You are not allowed to copy pieces of code from students of other groups. You may discuss the problems, but you may not share code.
You may not publish your solutions. Do not put your code into a public GitHub repository, for example.
If you use (small amounts of) material you find on the internet (e.g., Wikipedia, Stack Overflow, Reddit discussions), you must attribute the source. If we find matching code snippets without proper attribution, you are presenting others’ work as your own, which is considered plagiarism. As a guideline, less than 10% of your code should fall into this category. Why? Because you need to practice writing good code yourself, and there is no guarantee that code from the internet is as good as it seems.
Cases where plagiarism is spotted will be referred to the disciplinary committee of the university and may lead to suspension.
Tools & Python libraries
The following Python libraries are going to be used on this course:
If you want to run these tools on your own computer, it is strongly recommended that you set up an environment through Anaconda. Anaconda provides support for multiple environments and package management, and runs on the most popular operating systems (Microsoft Windows, Apple macOS, GNU/Linux). An Anaconda installation also enables easy use of Jupyter Notebooks.
JupyterHub
We have a shared JupyterHub that allows you to work on Jupyter notebooks on the department servers. This is very convenient especially for assignments 6 and 7 which require the installation of some packages, as the packages are already available on the server!
You can find out more about the cluster and how to apply for access here: https://git.chalmers.se/karppa/minerva/
Once you've been given access, you can log in with your CID here: http://minerva.cse.chalmers.se/jupyter
Note. The server is only accessible from within the Chalmers network. Accessing it from outside Chalmers requires you to use the VPN.
Changes made since the last occasion
CodeGrade has been introduced and is being piloted for a large-scale course. The system automatically grades technical aspects of the reports that combine text, images, and code, in the form of a Jupyter Notebook.
To ensure that the theoretical aspects of the syllabus are also examined, some weekly assignments are replaced by individual quizzes to be taken in Canvas. More details will be provided when the course starts.
Learning objectives and syllabus
Learning objectives:
- describe fundamental types of problems and main approaches in data science and AI
- give examples of data science and AI applications from different contexts
- give examples of how stochastic models and machine learning (ML) are applied in data science and AI
- explain basic concepts in classical AI, and the relationship between logical and data-driven, ML-based approaches within AI
- briefly explain the historical development of AI, what is possible today and discuss possible future development
- use appropriate programming libraries and techniques to implement basic transformations, visualizations and analyses of example data
- identify appropriate types of analysis problems for some concrete data science applications
- implement some types of stochastic models and apply them in data science and AI applications
- implement and/or use AI-tools for search, planning and problem solving
- apply simple machine learning methods implemented in a standard library
- justify which type of statistical method is applicable for the most common types of experiments in data science applications
- discuss advantages and drawbacks of different types of approaches and models within data science and AI
- reflect on inherent limitations of data science methods and how the misuse of statistical techniques can lead to dubious conclusions
- critically analyze and discuss data science and AI applications with respect to ethics, privacy and societal impact
- show a reflective attitude in all learning
Link to the syllabus on Studieportalen.
https://www.chalmers.se/en/education/your-studies/find-course-and-programme-syllabi/course-syllabus/DAT565 / DIT407/?acYear=2025/2026
Examination form
The course is graded PASS/FAIL (G/U).
There is no separate exam. Instead, you pass the course once you have passed all mandatory assignments. Assignments are done in groups of two students. Quizzes are taken individually; no collaboration is allowed.