Course syllabus

DAT340 / DIT867 Applied Machine Learning, 2022, Study Period 4 (7.5 hp)

Course responsible: Selpi

For questions about the course, please contact Selpi (selpi@chalmers.se).

The course is offered by the Department of Computer Science and Engineering.

Schedule

Course literature

The course will not follow a course book closely, but we'll provide some pointers to relevant material that you can read.

Some of the reading material is taken from Machine Learning: A First Course for Engineers and Scientists by Andreas Lindholm, Niklas Wahlström, Fredrik Lindsten, and Thomas B. Schön. You can access it at https://smlbook.org/ (we will use the draft version dated 30 April 2021).

During the course, we will publish links to additional online reading material.

For the practical coding parts of the course, we'll provide links to the API documentation of the libraries that we'll use. We'll also publish some Jupyter notebooks illustrating the usage of the libraries.

Course content

The course gives an introduction to machine learning techniques and theory, with a focus on their use in practical applications. During the course, a selection of topics will be covered in supervised learning, such as linear models for regression and classification and nonlinear models such as neural networks, and in unsupervised learning, such as clustering. The use cases and limitations of these algorithms will be discussed, and their implementation will be investigated in programming assignments. Methodological questions pertaining to the evaluation of machine learning systems will also be discussed, as well as some of the ethical questions that can arise when applying machine learning technologies.

There will be a strong emphasis on the real-world context in which machine learning systems are used. The use of machine learning components in practical applications will be exemplified, and realistic scenarios will be studied in application areas such as e-commerce, business intelligence, natural language processing, image processing, and bioinformatics. The importance of the design and selection of features, and their reliability, will be discussed.

Learning outcomes

On successful completion of the course the student will be able to:

Knowledge and understanding

describe the most common types of machine learning problems,
explain what types of problems can be addressed by machine learning, and the limitations of machine learning,
account for why it is important to have informative data and features for the success of machine learning systems,
explain on a high level how different machine learning models generalize from training examples.

Skills and abilities

apply a machine learning toolkit in an application relevant to the data science area,
write the code to implement some machine learning algorithms,
apply evaluation methods to assess the quality of a machine learning system, and
compare different machine learning systems.

Judgement and approach

discuss the advantages and limitations of different machine learning models with respect to a given task,
reason about what type of information or features could be useful in a machine learning task,
select the appropriate evaluation methodology for a machine learning system and motivate this choice,
reason about ethical questions pertaining to machine learning systems.

Link to the syllabus:

Study plan - Chalmers

Study plan - GU

Assessment and grades

The course uses a numerical scale (Fail or 3–5).

To be awarded a Pass (3) for the full course, the grade Pass must be obtained on both sub-courses (assignments and take-home exam, see below).
To be awarded a higher grade (4 or 5) for the full course, you need to have passed both the sub-courses with a grade of 4 or 5, respectively.
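
One way to read the rule above is that the full-course grade is the lower of the two sub-course grades. The sketch below encodes that reading; the function name `course_grade` and the use of the string "U" for Fail are illustrative assumptions, not part of the official regulations.

```python
def course_grade(assignments_grade, exam_grade):
    """Combine two sub-course grades ("U", 3, 4 or 5) into a full-course grade.

    Assumption: the full-course grade is the lower of the two sub-course
    grades, and a Fail ("U") on either sub-course fails the full course.
    """
    if "U" in (assignments_grade, exam_grade):
        return "U"
    return min(assignments_grade, exam_grade)


print(course_grade(5, 4))    # 4
print(course_grade(3, 5))    # 3
print(course_grade("U", 5))  # U
```

Under this reading, a 5 on the assignments combined with a 4 on the take-home exam gives a 4 for the full course.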

Assignments (3.5 credits).

The assignments in the course consist of a number of programming tasks and two written essays. The programming assignments require the students to submit their solutions (program code) to the examiner. Some assignments will additionally require the submission of a report where the results of the experiments are interpreted, and some may require the discussion of a research paper. Except for the first written assignment, the assignments are normally carried out in groups. (The allowed size of a group may vary between assignments.)

Each submitted assignment will be awarded a score, where 0 corresponds to a Fail (U) grade. For a submission to be awarded a Pass (corresponding to a grade of 3), the code and report need to be clearly written and contain only minor errors. To get a higher score (corresponding to a 4 or 5), the solutions should be correct, well explained, and more insightful; in addition, some assignments may require the student to solve additional tasks to get a higher score. Large assignments that require a longer time to solve will be given a score of up to 20 points, while smaller assignments will be given a score of up to 10 points.

If you submit a solution before the deadline that does not meet the requirements for a Pass grade, you will get some feedback and be asked to correct the most important errors within a stipulated period of time. The score will be reduced in this case. If you miss the deadline (or the resubmission deadline), or if the examiner believes that the solution was not an honest attempt to solve the assignment, the solution will get the grade of Fail (U).

If you have received the U grade for some submitted assignment, you can submit a new attempt one week before the re-sit exam.

To get the grade Pass (3) for the assignment part of the course, all assignments must be passed. The final grade in the Assignment sub-course will be based on the total sum of scores for all the assignments, using the following thresholds:

72-90 (at least 80% of the maximum score): 5

54-71.9 (at least 60% of the maximum score): 4

7-53.9: 3
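
As an illustration, the thresholds above can be written as a small function. This is a sketch only: the function name `assignment_grade` and its parameters are hypothetical, and returning "U" for totals below 7 is an assumption, since the list above does not cover that range.

```python
def assignment_grade(total_score, all_passed):
    """Map the total assignment score (0-90) to a grade for the Assignment sub-course.

    all_passed: True only if every individual assignment has been passed,
    which is required for any passing grade.
    """
    if not all_passed:
        return "U"
    if total_score >= 72:   # at least 80% of the maximum score of 90
        return "5"
    if total_score >= 54:   # at least 60% of the maximum score of 90
        return "4"
    if total_score >= 7:    # lowest total listed for a grade of 3
        return "3"
    return "U"              # assumption: totals below 7 are not listed above


print(assignment_grade(75, True))     # 5
print(assignment_grade(71.9, True))   # 4
print(assignment_grade(40, True))     # 3
print(assignment_grade(80, False))    # U
```

Note that a high total alone is not enough: a single failed assignment gives a U for the sub-course until it has been resubmitted and passed.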

Take-home exam (4 credits).

The take-home exam will be given in late May.

The exam will consist of two parts. The first part is a "sanity check" covering a number of basic topics fundamental to machine learning. To get the grade Pass (G), the student needs to answer these questions near perfectly.

The second part of the exam consists of questions that require a deeper understanding of the topics and more independent thinking. Students who give correct answers to most of the questions in the second part will receive a higher grade (4 or 5).

General Rules and Policies

The deadlines for submitting assignments are firm. Any delay must be justified and announced before the deadline. Unannounced late submissions will not be considered.
Discussing the assignments during the course is allowed, even encouraged. Also, do not hesitate to ask if you have difficulties with the assignments, or if something is unclear.
You must write your final solutions on your own, using your own words, and expressing them in the way you understood them yourself.
Submitting others' work in your own name is cheating! It can lead to severe consequences, in serious cases even suspension from studies.
Specifically, it is prohibited to copy (with or without modifications) from each other, from books, articles, web pages, etc., and to submit solutions that you got from other persons, unless you explicitly acknowledge the sources and add your own explanations. We will be particularly watchful if exercises appear as (allegedly) innocent questions in internet forums.
You are also responsible for not giving others the opportunity to copy from your work. We will not investigate who copied from whom.
For a more detailed explanation about cooperation and cheating, see this document. Here is some more information about academic integrity and honesty at Chalmers and at GU.