Course syllabus

Note: All information provided on this page is preliminary until the course starts. Each module, including its assignments, may also be updated before it starts.

Course-PM

DAT570 / DIT762 Continuous optimization in data science lp2 HT24 (7.5 hp)

The course is offered by the Department of Computer Science and Engineering.

Contact details

For questions regarding administration, please contact the teacher and examiner, Ashkan Panahi:

ashkan.panahi@chalmers.se

For questions regarding assignments, contact the TA:

Firooz Shahriari Mehr: firooz@chalmers.se

Student Representatives

Nora Hemmingsson: norahemmingson@gmail.com

Cajsa Jacobsson: cajsajac@gmail.com

Brian Nguyen: brianphi.nguyen@gmail.com

Yanli Wang: yanliw@student.chalmers.se

Chengyu Zhu: zhuchy66@163.com

Course purpose

Behind the success of modern data science approaches lies the power of continuous optimization algorithms. In this course, we discuss various aspects of these algorithms and the main methods for designing and improving them in data science.

The course discusses the main aspects of optimization problems and algorithms in data science, such as convergence and statistical learning theory. We present different features of optimization problems, such as convexity, smoothness, and constraints, and show how they lead to different design strategies. We also discuss challenges in data science by presenting real-world examples and the main algorithmic ideas that address them.

Schedule

There will be two lecture sessions and one exercise/consultation session every week. Please check TimeEdit for more details:

TimeEdit

Course literature

No mandatory literature. Reference literature, further reading, and other non-mandatory texts are given on the page of each module.

Course design

In the course, we will cover both theoretical subjects and practical issues involved in data science. Students are only expected to have basic math and programming skills. The course is divided into 7 weeks, where each week covers a particular subject. There will be two lecture sessions each week (typically Tuesdays and Thursdays at 13:15), and an additional last session, typically on Fridays, for consultation and review of the assignments, as well as special topics on request (see below). The list of subjects is as follows:

Week 1: Continuous optimization and statistics, linear programming, and convex analysis

Week 2: Convex vs. nonconvex optimization, Lagrangian duality, and introduction to optimization algorithms

Week 3: Constrained optimization algorithms: projected gradient, primal-dual algorithms, and interior point methods

Week 4: Non-smooth optimization algorithms: sub-gradient descent, splitting, and proximal operators

Week 5: Accelerated algorithms: heavy-ball method, Nesterov acceleration, and FISTA

Week 6: Stochastic optimization algorithms: stochastic gradient descent, stochastic proximal-projection, stochastic variance reduction, and stochastic averaging

Week 7: Distributed optimization algorithms: distributed splitting (ADMM), gradient tracking, and directed graphs

Assignments:

There will be one assignment per week covering a topic in data science related to the subject of the week. Each assignment typically includes a few written questions (for example, deriving a step in an algorithm or discussing a concept from the classroom) and programming projects. A preliminary list of topics for the assignments is as follows:

Assignment 1: Gaussian Maximum Likelihood, Linear Regression and Graph Learning

Assignment 2: Linear classification: Logistic models and support vector machine

Assignment 3: Clustering by EM algorithm, projection operators, and probability simplex

Assignment 4: Sparsity-based methods in data science

Assignment 5: Compressed sensing, ISTA and FISTA

Assignment 6: Optimal transport with stochastic optimization

General Instructions for Assignments:

The assignments are done in groups of 2 students. Find a partner and register your group in the "People" tab. You only need to register your group once, during module 1. Read the instruction pdf file of each assignment carefully. There can be errors or typos in them, so if you are unsure, contact the teacher as soon as you can. Specific instructions for submissions are given on the assignment pages, but generally you need to submit a pdf file with your solutions to the questions and the reports of your coding projects, as well as your code, preferably as a Python notebook. It is your responsibility to make sure that the answers are clear and the code is commented well enough to be graded easily.

Only one person needs to submit on behalf of both students, but the front page of your submission must include the names of both students. Each assignment is due on a Friday. This means the following schedule:

Assignment 1: November 15, 23:59

Assignment 2: November 22, 23:59

Assignment 3: November 29, 23:59

Assignment 4: December 6, 23:59

Assignment 5: December 13, 23:59

Assignment 6: December 20, 23:59

Each assignment will be available as soon as possible, but definitely before the first session of the corresponding module (on Tuesdays).

Python is the preferred language for the assignments, but alternative arrangements can be made on request.

Grading

Each assignment is worth 10 points, and the same grade is given to both members unless one member raises a complaint about the other's contribution.

The details of the grading (points) are given in the instruction pdf. The raw point total of each assignment is more than 10, but it will be scaled down to the 10-point scale. There are usually extra points too, so you can earn more than 10 points on an assignment, which helps your final grade.

To pass an assignment, you need to receive at least 4 points on it. If the grade is below 4, the teacher may ask for a resubmission, with instructions for improvement and a new deadline.

Late Submission:

You can request an extension of one week, but in that case your grade will be multiplied by 0.8 (i.e. the maximum will be 8). Under exceptional conditions, the full grade (i.e. out of 10) can be granted with the extension.

Sessions on Wednesday Afternoon:

The third session of each week (usually on Wednesday at 13:15) has multiple purposes. You can attend this session for consultation about the assignment that you are currently working on. We may also discuss the assignment that was due the week before. The answers will not be revealed (due to the possibility of late submissions), but we may discuss other aspects of the assignment. Finally, you are encouraged to suggest "special topics" that interest you. In the last part of the sessions, we can discuss these topics on request. There will be a thread in the discussions where you can suggest a topic and the module it might suit best.

Examination form

There will not be any final exam. To pass the course, you need to pass every assignment (i.e. receive at least 4 points in every assignment). The reported grade is as follows:

- Sufficient (3/G/40) for average grades in [4, 6)
- Good (4/G/60) for average grades in [6, 8)
- Very Good (5/VG/80) for average grades greater than or equal to 8

Note: The grades shown on Canvas might initially be different due to e.g. automatic normalization.
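The grading rules above can be sketched in a few lines of Python. This is only an illustration, not the official grading script: it assumes the average is a plain arithmetic mean over all assignments, and the function names, the "Not passed" label, and the example scores are hypothetical.

```python
from statistics import mean

PASS_THRESHOLD = 4.0   # minimum points needed on every assignment
LATE_FACTOR = 0.8      # multiplier applied to a late submission


def apply_late_penalty(points: float, late: bool) -> float:
    """Return the points after the late-submission multiplier, if any."""
    return points * LATE_FACTOR if late else points


def course_grade(points_per_assignment: list[float]) -> str:
    """Map final assignment points to the reported grade.

    Assumption: the average is a plain arithmetic mean over all assignments.
    """
    if not points_per_assignment:
        raise ValueError("need at least one assignment")
    # Every assignment must be passed individually.
    if min(points_per_assignment) < PASS_THRESHOLD:
        return "Not passed"  # hypothetical label, not from the syllabus
    average = mean(points_per_assignment)
    if average >= 8:
        return "Very Good (5/VG/80)"
    if average >= 6:
        return "Good (4/G/60)"
    return "Sufficient (3/G/40)"


# Hypothetical example: six assignments, the third one submitted late.
scores = [9.0, 7.5, apply_late_penalty(10.0, late=True), 6.0, 8.5, 7.0]
print(course_grade(scores))
```

Note that a single assignment below 4 points blocks a passing grade regardless of the average, which is why the sketch checks the minimum before computing the mean.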

Re-examination: If you do not pass the course, re-exams will be offered in written format.

Learning objectives and syllabus

Learning objectives:

Knowledge and understanding:

- describe different types of optimization problems, such as continuous/discrete/mixed, convex/nonconvex
- explain what types of data science problems can be addressed by optimization problems
- explain the main principles of different optimization algorithms and their global/local convergence
- account for the computational complexity of optimization algorithms in data science, as well as their performance

Skills and abilities:

- implement various optimization algorithms as computer programs
- apply and adapt optimization algorithms to data science problems, such as machine learning and rule-based approaches
- find approximate solutions to computationally hard problems
- formulate various problems in data science as mathematical optimization problems

Judgement ability and approach:

- reason about what type of information or features of the data could be useful in selecting optimization algorithms
- select the appropriate evaluation methodology including performance and convergence analysis

Link to the syllabus on Studieportalen.

Study plan