Course syllabus
Note: All information provided on this page is preliminary until the course starts.
Course-PM
DAT570 / DIT762 Continuous optimization in data science lp2 HT23 (7.5 hp)
The course is offered by the Department of Computer Science and Engineering.
Contact details
For questions, please contact the teacher and examiner, Ashkan Panahi:
Student Representatives
Amanda Levinsson amanda.levinsson@telia.com
Axel Qvarnström axel.qvarnstrom@gmail.com
Frida Sjögren fridajohannalinnea@gmail.com
Course purpose
Behind the success of modern data science approaches lies the power of continuous optimization algorithms. In this course, we discuss various aspects of these algorithms and the main methods for designing and improving them in data science.
The course discusses the main aspects of optimization problems and algorithms in data science, such as convergence and statistical learning theory. We present different features of optimization problems, such as convexity, smoothness, and constraints, and show how they lead to different design strategies. We also discuss challenges in data science by presenting real-world examples and the main algorithmic ideas that address them.
Schedule
There will be two lecture sessions and one exercise/consultation session every week. Please check TimeEdit for more details:
Course literature
No mandatory literature. Reference literature, further reading, and other non-mandatory texts are given on the page of each module.
Course design
Week 1: Continuous optimization and statistics, linear programming, and convex analysis
Week 2: Convex vs. nonconvex optimization, Lagrangian duality, and introduction to optimization algorithms
Week 3: Constrained optimization algorithms: projected gradient, primal-dual algorithms, and interior point methods
Week 4: Non-smooth optimization algorithms: sub-gradient descent, splitting, and proximal operators
Week 5: Accelerated algorithms: heavy-ball method, Nesterov acceleration, and FISTA
Week 6: Stochastic optimization algorithms: stochastic gradient descent, stochastic proximal-projection, stochastic variance reduction, and stochastic averaging
Week 7: Distributed optimization algorithms: distributed splitting (ADMM), gradient tracking, directed graphs
Assignments:
There will be one assignment per week, covering a topic in data science related to the subject of the week. Each assignment typically includes a few written questions (for example, deriving a step in an algorithm or discussing a concept from the classroom) and programming projects. A preliminary list of topics for the assignments is as follows:
Assignment 1: Maximum Likelihood in data science: Classification and Regression
Assignment 2: EM algorithm in data science: Clustering and Autoencoders
Assignment 3: Projection operators and constrained linear regression
Assignment 4: Sparsity-based methods in data science
Assignment 5: Momentum method in Neural Networks: A simple transformer
Assignment 6: Optimal transport with stochastic optimization
Assignment 7: A simple Federated learning scheme
General Instructions for Assignments:
The assignments are done in groups of 2 students. Find a partner and register your group in the "People" tab. You only need to register your group once, during module 1. Read the instruction pdf file of each assignment carefully. There can be errors or typos in them, so if you are unsure, contact the teacher as soon as you can. Specific instructions for submission are given on the assignment pages, but generally you need to submit a pdf file with your solutions to the questions and reports on your coding projects, as well as your code, preferably as a Python notebook. It is your responsibility to make sure that the answers are clear and the code is commented well enough to be graded easily.
Only one person needs to submit on behalf of both students, but the front page of your submission must include the names of both students. Each assignment was originally due on the Thursday of the week following the corresponding module (THIS RULE HAS CHANGED: from module 2 onward, the deadlines are on Mondays). This gives the following schedule:
Module 1: November 10, 23:59
Module 2: November 20, 23:59
Module 3: November 27, 23:59
Module 4: December 4, 23:59
Module 5: December 11, 23:59
Module 6: December 18, 23:59
Module 7: December 25, 23:59
Each assignment will be available as soon as possible, but definitely before the first session of the corresponding module (on Tuesdays).
Python is the preferred language for the assignments, but alternative arrangements can be made on request.
Grading
Each assignment is graded out of 100 and the same grade will be given to both members, except if there is a complaint by one of the members about the contribution of the other one. The details of the grading (points) are given in the instruction pdf. In order to pass the assignments you need to receive at least 40 points. If the grade is below 40, the teacher may ask for resubmission with instructions for improvement and new deadlines.
Late Submission:
You can request an extension of one extra week, but in that case your grade will be multiplied by 0.8 (i.e. the maximum will be 80). Under exceptional conditions, the full grade (i.e. out of 100) can be granted with the extension.
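As a rough illustration of the arithmetic above, the following Python sketch scales an assignment score when an extension is used. The function and constant names are hypothetical, and applying the 40-point pass threshold to the scaled score is an assumption, not something stated on this page; ask the teacher if in doubt.

```python
PASS_THRESHOLD = 40   # minimum points to pass an assignment (see Grading)
LATE_FACTOR = 0.8     # multiplier applied when a one-week extension is used
MAX_POINTS = 100      # each assignment is graded out of 100


def effective_score(raw_points, extended=False, exceptional=False):
    """Return the points that count after any late-submission scaling."""
    if not 0 <= raw_points <= MAX_POINTS:
        raise ValueError("raw_points must be between 0 and 100")
    # The 0.8 factor applies only to extensions without exceptional conditions.
    if extended and not exceptional:
        return raw_points * LATE_FACTOR
    return float(raw_points)


def passes(raw_points, extended=False, exceptional=False):
    """Assumption: the pass threshold is checked against the scaled score."""
    return effective_score(raw_points, extended, exceptional) >= PASS_THRESHOLD
```

For example, a submission worth 90 points counts as 72 with a regular extension, and as 90 if the extension is granted under exceptional conditions.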
Sessions on Fridays:
The third session of each week (usually on Fridays) has multiple purposes. You can attend this session for consultation about the assignment that you are currently working on. After the consultation, we will discuss the assignment that was due the day before. The answers will not be revealed (due to the possibility of late submissions), but we may discuss other aspects of the assignment. Finally, you are encouraged to suggest "special topics" that interest you. In the last part of each session, we will discuss these topics on request. There will be a thread in the discussions where you can suggest a topic and the module it might suit best.
Examination form
There will not be any final exam. To pass the course, you need to pass every assignment (i.e. receive at least 40 points in every assignment). The reported grade is as follows:
- Sufficient (3/G/40)
- Good (4/G/60)
- Very Good (5/VG/80)
Note: The grades shown on Canvas might initially be different due to e.g. automatic normalization.
Re-examination: If you do not pass the course, there will be re-exams in written format.
Learning objectives and syllabus
Learning objectives:
Knowledge and understanding:
- describe different types of optimization problems, such as continuous/discrete/mixed, convex/nonconvex
- explain what types of data science problems can be addressed by optimization problems
- explain the main principles of different optimization algorithms and their global/local convergence
- account for the computational complexity of optimization algorithms in data science, as well as their performance
Skills and abilities:
- implement various optimization algorithms as computer programs
- apply and adapt optimization algorithms to data science problems, such as machine learning and rule-based approaches
- find approximate solutions to computationally hard problems
- formulate various problems in data science as mathematical optimization problems
Judgement ability and approach:
- reason about what type of information or features of the data could be useful in selecting optimization algorithms
- select the appropriate evaluation methodology including performance and convergence analysis
Link to the syllabus on Studieportalen.