MVE190 / MSG500 Linear statistical models

Course PM

This page contains the program of the course: lectures, exercise sessions and computer labs. Other information, such as learning outcomes, teachers, literature and examination, is in a separate course PM.

List of topics and potential exam questions: 2021-22_Topics and hence potential exam questions.pdf

StatisticalTables.pdf

Midcourse meeting notes: Minutes of the mid-course meeting 2021.pdf

 

Program

The schedule of the course is in TimeEdit. The course will take place entirely in-person in University rooms, NOT via zoom (with the exception of the compulsory mini-analyses presentations which can be attended in person or via zoom). 

The course will be given in a hybrid fashion, that is, both in person in University rooms and live-streamed on zoom. Links are given below. Lectures will be recorded, and recordings will be posted on this page. However, students should not count on these recordings as a reliable substitute for in-person attendance: the lecturer will use the blackboard in addition to slides, but may not make use of the directional cameras in the rooms, so some blackboard writing will likely not be easily visible. In-person attendance is therefore recommended.

We will have a single non-compulsory computer lab at 15.15-17.00 on 4 November, which will NOT be via zoom. This lab is not structured with presentations of concepts. The lab is useful to get you started with the R/RStudio software if you have no previous experience. If you already have some experience with R, you probably won't need it.

Note that attendance is mandatory, either in the room or via zoom, for some activities: the Friday "mini-analyses". See the course PM for details and for what to do if you are unable to attend.

 

Lectures (the plan below is based on last year; deviations may occur).

Note on video lectures: you can (i) download them as MP4 files by hovering over the video preview, right-clicking and choosing "Save as"; or (ii) stream them. I have noticed that Canvas compresses the video in a way that occasionally makes some text slightly blurry. A downloaded version should be of higher quality.

Week | Topics | Zoom links and video lectures | Slides/notes | Code and data files | MiniAnalysis
44

Tuesday: General intro to the course and mention of several topics: bias in linear regression, least squares; parameters interpretation in simple linear regression; the 5 basic assumptions.

 

 

 

lab0.pdf (you must go through this file before the Thursday lab).

slides_1.pdf

Also, check "lecture 1" in Jörnsten's notes.

Lecture1.R

no minianalysis presentations this week.

44

Thursday: derivation of the least squares estimates; proof of unbiasedness; variance of the estimators.

 

formula sheet for properties of expectation, variance etc

slides_2.pdf

slides_2_annotated.pdf

Check "lecture 2" in Jörnsten's notes (linked in the previous lecture).

Demo1-2021.R

Lecture2.R

sleeptab.dat

no minianalysis presentations this week.

44

Thursday Computer lab at 15.15-17.00 in MVF24 and MVF25: this lab is not structured with presentation of concepts. We are there to help if you have questions regarding exercises.

 

 NO ZOOM (this is in-person)

lab1.pdf

repair1.txt

 

45

Tuesday: some residual-based diagnostics; Box-Cox transformations; leverage values; deletion-based diagnostics; MSE.

 

slides_3.pdf

slides_3_annotated.pdf

Box-Cox transformations: see section 12.4 in Rawlings' book.

Check "lecture 3" in Jörnsten's notes.

Demo1-2021.R (updated)

boxcox-type_transforms.R

 

45

Thursday: unbiasedness of yhat; expectation and variance of residuals; standardised residuals; unbiasedness of MSE; proof of the variability decomposition (SSEr, SStor, SSRegr); Rsquared; t-test construction.

 

Optional exercises for self-study from Rawlings' book: exercise 1.1, exercise 1.4, relevant bits of exercise 1.9, ex. 1.10, ex. 1.16.

slides_4.pdf

slides_4_annotated.pdf

Check "lecture 4" in Jörnsten's notes.

Demo3.R

Lecture3.R

Friday 12 Nov*: present work for minianalysis 1

*typically 1 hour only

MiniAnalysis1-2021.pdf

AirBnB_NYCity_2019.csv

TV.dat

zoom (in-room presence is recommended)

46

Tuesday: confidence intervals for parameters and for E(Y0). Prediction intervals for Ypred0. Simpson's paradox and notation for multiple linear regression.

slides_5.pdf

slides_5_annotated.pdf

(see also the relevant bits in lectures 4-5 of Jörnsten's notes)

Demo4.R

Advertising.csv

advertising.R

Lecture4.R (this file contains some bits we haven't covered yet, such as the F-test)

no minianalysis presentation this week

46

Thursday: properties of the estimators in multiple regression and sampling distributions. Confidence intervals, t-test and categorical covariates (not everything); problems with p-values and large datasets. These topics are also found in chapter 9 ("Class variables") in Rawlings et al., up to sect. 9.3.

 

Optional exercises for self-study from Rawlings' book: exercise 3.5 (parts a and d), ex. 3.10, ex. 3.11, ex. 3.12 (we never do regression without intercept... but if you are interested...), ex. 3.13.

 

 

slides_6.pdf

slides_6_annotated.pdf

Check "lecture 10" in Jörnsten's notes.

 

multipleregression.R

auction.R

auction.dat

 

 

no minianalysis presentation this week

47

Tue: models with categorical and numerical covariates. Multicollinearity and VIF. Partial F test.

 

slides_7.pdf

slides_7_annotated.pdf

Check "lecture 6" in Jörnsten's notes.

auction.R (updated!)

Lecture6.R

SA.dat

multicollinear.R

global+partial_Ftests.R

 

47

Thu: greedy variable selection: backward search. Global F test. Bias/variance tradeoff and the pMSE. Training and testing.

 

 

 

slides_8.pdf

slides_8_annotated.pdf

Check "lecture 7" in Jörnsten's notes.

Lecture7.R

Friday 26 Nov*: present minianalysis 2

*typically 1 hour only

MiniAnalysis2-2021.pdf

 https://chalmers.zoom.us/j/63885730848
  Password: 172698

48

Tue: (from slides_8.pdf) PMSE with regsubsets. Then (slides_9.pdf) Mallows' Cp, interactions, adj-Rsquared.

 

 

The above is only the first 45 minutes, sorry! :( I am going to upload a recording from last year too, to compensate.

Below is a chunk from last year.

 

And the last chunk from 2020

 

slides_9.pdf

slides_9_annotated.pdf

(optional if you are interested) AkaikeEasyIntro.pdf

See also "lecture 11" in Jörnsten's notes.

regsubsets-olsrr-categorical.R

regsubsets-categorical-covar.pdf

48

 

Thu: Kullback-Leibler; AIC, BIC; K-fold CV; LOOCV.

slides_10.pdf

slides_annotated_10.pdf

Lecture8.R

rsquaredAICstep.R

regsubsets-categorical-CV.R

Friday 3 Dec*: present minianalysis 3

MiniAnalysis3-2021.pdf

regsubsets-categorical-covar.pdf

regsubsets-olsrr-categorical.R

PROJECT (NEW): project21.pdf

CarPrice.csv

apprentice.txt

49

Tue: hat-matrix; residuals; standardised residuals, studentised residuals; Cook's distance, DFBETAs; intro to GLMs.

 

 

slides_11.pdf

See also "lecture 14" in Jörnsten's notes.

(Additional support: Agresti's book chapters 4 and 14.4)

leverage-residuals-cook.R

f6data.txt

 

 

 

49

Thu: the exponential family; Newton-Raphson; Poisson regression; confidence intervals for GLMs; asymptotic properties of the MLE.

 

slides_12.pdf

slides_12_annotated.pdf

poisson_nb.R

f10.txt

f10b.txt

poissregr-awards.R

poisson_sim.csv

 

50

Tue: CI for predictions; Wald test; deviance; likelihood ratio test;  Poisson + offset term.

 

 

slides_13_annotated.pdf

slides_13.pdf

shipdamage.R

(we also revisited poisson_nb.R and poissregr-awards.R, uploaded previously)

50

Thu: (there WILL be a lecture)

Negative binomial regression, also with offset. Quick tour of GLM diagnostics.

 

slides_14.pdf

poisson_nb.R 

(updated from line 172 onward with a comparison of residuals for Poisson vs NB regression)

 


 

Computer labs and software

Software: We will use the statistical package R to analyze data, through the RStudio interface. You will need to install both on your computer; see the instructions.
No previous knowledge of R is required, but you are encouraged to attend the lab on Thursday 4 November. No further computer labs will be given.

Some useful resources:

If you are familiar with MATLAB or Python, the following may be useful:

 

