Course syllabus
This page contains the program of the course: lectures, exercise sessions and computer labs. Other information, such as learning outcomes, teachers, literature and examination, is in a separate course PM. Lectures in this course edition will be in person (no Zoom, no video-recorded lectures). For the compulsory mini-analysis presentations you can either connect via Zoom or attend in the room (see below).
List of topics and potential exam questions: 2022-23_Topics and hence potential exam questions_V3.pdf
Notes of the mid-course meeting (18 Nov 2022) are here.
Program
The schedule of the course is in TimeEdit.
Lectures will be "in-person" in University rooms and will not be recorded.
We will have a single non-compulsory computer lab at 15.15-17.00 on 3 November, which will NOT be available via Zoom. This lab is not structured around presentations of concepts. It is useful for getting started with the R/RStudio software if you have no previous experience. If you already have some experience with R, you probably won't need it.
Note that attendance is mandatory, either in the room or via Zoom, for some activities: the Friday "mini-analyses". See the course PM for details and for what to do if you are unable to attend.
Lectures (the plan below is based on last year; deviations may occur).
(Here are the videos of the 2020 lectures. These are useful material in case you miss a lecture, but we discourage you from relying on them as a substitute for attendance. Two years have passed, a few things have changed, and we don't want you to get confused!)
week | Topics | slides/notes | code and data files | MiniAnalysis |
---|---|---|---|---|
44 | Tuesday: general introduction to the course and overview of several topics: bias in linear regression; least squares; parameter interpretation in simple linear regression; the 5 basic assumptions. | lab0.pdf (you must go through this file before the Thursday lab). Also check "lecture 1" in Jörnsten's notes, found in the Course PM. | | |
44 | Thursday: derivation of the least squares estimates; proof of unbiasedness; variance of the estimators. | slides_2.pdf (reuploaded 8 Nov). Check "lecture 2" in Jörnsten's notes (linked in the previous lecture). | | No minianalysis presentations this week. |
44 | Thursday 3 November: computer lab at 15.15-17.00 in MVF24 and MVF25. This lab is an introduction to R and RStudio. It is not structured around presentations of concepts; we are there to help if you have questions about the exercises. | | | |
45 | Tuesday: some residual-based diagnostics; Box-Cox transformations; leverage values; deletion-based diagnostics. | Box-Cox transformations: see section 12.4 in Rawlings' book. Check "lecture 3" in Jörnsten's notes. | See also Lecture2.R again. | |
45 | Thursday: MSE; unbiasedness of yhat; expectation and variance of the residuals; proof of the variability decomposition (SSEr, SStor, SSRegr); Rsquared; t-test construction (we did not reach the p-values part). Optional exercises for self-study from Rawlings' book: exercises 1.1, 1.4, relevant bits of 1.9, 1.10 and 1.16. | slides_4.pdf (updated 14 November). Check "lecture 4" in Jörnsten's notes. | | Friday 11 Nov*: present work for minianalysis1 (MiniAnalysis1-2022.pdf). *Typically ~70 min only. |
46 | Tuesday: standardised residuals; p-values; unbiasedness of MSE; confidence intervals for parameters and for E(Y0); prediction intervals for Ypred0; Simpson's paradox and notation for multiple linear regression. | (See also the relevant bits in lectures 4-5 of Jörnsten's notes.) | Lecture4.R (this file contains some bits we have not covered yet, such as the F-test) | No minianalysis presentation this week. |
46 | Thursday: properties of the estimators in multiple regression and sampling distributions; t-test and categorical covariates (not everything). Topics also found in chapter 9 ("Class variables") of Rawlings et al., up to sect. 9.3. Optional exercises for self-study from Rawlings' book: exercises 3.5 (parts a and d), 3.10, 3.11, 3.12 (we never do regression without intercept... but if you are interested...) and 3.13. | Check "lecture 10" in Jörnsten's notes. | | No minianalysis presentation this week. |
47 | Tue: models with categorical and numerical covariates; problems with p-values and large datasets; multicollinearity. | slides_7.pdf (I have now moved some topics to slides_8.pdf). Check "lecture 6" in Jörnsten's notes. | | |
47 | Thu: Variance Inflation Factor; partial F-test; greedy variable selection: backward search; bias/variance tradeoff and the pMSE; training and testing. | slides_8.pdf (some material has moved to slides_9.pdf). Check "lecture 7" in Jörnsten's notes. | | Friday 25 Nov*: present minianalysis 2 (https://chalmers.zoom.us/j/66122542774). *Typically ~70 min only. |
48 | Tue: pMSE with regsubsets; then (slides_9.pdf) Mallows' Cp and interactions (barely started). | regsubsets-categorical-covar.pdf; see also "lecture 11" in Jörnsten's notes. | auction.R (minimally changed) | No minianalysis this week. |
48 | Thu: interactions again; adjusted Rsquared; Kullback-Leibler; AIC, BIC; K-fold CV; LOOCV. | AkaikeEasyIntro.pdf (optional, if you are interested) | | No minianalysis this week. |
49 | Tue: LOOCV; hat matrix; residuals. | slides_11.pdf (up to page 16). See also "lecture 14" in Jörnsten's notes. (Additional support: Agresti's book, chapter 4 and section 14.4.) | The PROJECT is available: project22.pdf | |
49 | Thu: standardised residuals, studentised residuals; Cook's distance, DFBETAs; intro to GLMs; the exponential family. | slides_12.pdf | | Friday 9 Dec*: present minianalysis 3 (https://chalmers.zoom.us/j/61249528380). *Typically ~70 min only. |
50 | Tue: Newton-Raphson; Poisson regression; confidence intervals for GLMs; asymptotic properties of the MLE; CI for predictions; Wald test; deviance; likelihood ratio test. | Completion of slides_12.pdf, then slides_13.pdf | | No minianalysis this week. |
50 | Thu: Poisson regression with an offset term; negative binomial regression, also with offset; quick tour through GLM diagnostics (diagnostics can be skipped for the exam). | | | No minianalysis this week. |
Computer labs and software
Software: We will use the statistical package R to analyze data, through the RStudio interface. You will need to install both on your computer; see the instructions.
No previous knowledge of R is required. You are encouraged to attend the lab on Thursday 3 November to experiment with some basic analyses. No further computer labs will be given.
Some useful resources:
- R: there are lots of (too many!) resources online for learning R. Here is just one possible starting point, from the University of Copenhagen: http://r.sund.ku.dk/index.html
If you are familiar with MATLAB or Python, the following may be useful:
- a MATLAB/R cheat sheet: http://mathesaurus.sourceforge.net/octave-r.html
- a MATLAB/Python/R cheat sheet: http://mathesaurus.sourceforge.net/matlab-python-xref.pdf