MVE190 / MSG500 Linear statistical models

This page contains the program of the course: lectures, exercise sessions and computer labs. Other information, such as learning outcomes, teachers, literature and examination, is in a separate course PM. Lectures in this course edition are in-person only (no Zoom, no video recordings). For the compulsory mini-analysis presentations you can either be in the room or connect via Zoom (see below).

List of topics and potential exam questions: 2022-23_Topics and hence potential exam questions_V3.pdf

Notes of the mid-course meeting (18 Nov 2022) are here.

Program

The schedule of the course is in TimeEdit

Lectures will be "in-person" in University rooms and will not be recorded.

We will have a single non-compulsory computer lab at 15.15-17.00 on 3 November, which will NOT be available via Zoom. The lab does not include presentations of concepts; it is meant to get you started with the R/RStudio software if you have no previous experience. If you already have some experience with R you probably won't need it.

Note that attendance is mandatory, either in the room or via Zoom, for some activities: the Friday "mini-analyses". See the course PM for details and for what to do if you are unable to attend.

 

Lectures (below is a plan based on last year. Deviations may occur).

(Here are the videos of the 2020 lectures. Consider them useful material in case you miss a lecture, but we discourage you from relying on them as a substitute for attendance. Two years have passed, a few things have changed, and we don't want you to get confused!)

 

Week; Topics; Slides/notes; Code and data files; MiniAnalysis
44

Tuesday: general intro to the course and mention of several topics: bias in linear regression; least squares; parameter interpretation in simple linear regression; the 5 basic assumptions.

slides_1.pdf

slides_1_annotated.pdf

lab0.pdf (you must go through this file before the Thursday lab).

Also check "lecture 1" in Jörnsten's notes, found in the course PM.

Lecture1.R

no minianalysis presentations this week.

44

Thursday: derivation of the least squares estimates; proof of unbiasedness; variance of the estimators.

 

formula-sheet.pdf

slides_2.pdf (reuploaded 8 Nov)

Check "lecture 2" in Jörnsten's notes (linked in the previous lecture).

Lecture2.R

Demo1-2022.R

sleeptab.dat

no minianalysis presentations this week.

44

Thursday 3 November: computer lab at 15.15-17.00 in MVF24 and MVF25. This lab is an intro to R and RStudio. It does not include presentations of concepts. We are there to help if you have questions regarding the exercises.

 

lab1.pdf

repair1.txt

 

45

Tuesday: some residual-based diagnostics; Box-Cox transformations; leverage values; deletion-based diagnostics.

 

slides_3.pdf

Box-Cox transformations: see section 12.4 in Rawlings' book.

Check "lecture 3" in Jörnsten's notes.

boxcox-type_transforms.R

also see again Lecture2.R

 

45

Thursday: MSE; unbiasedness of yhat; expectation and variance of residuals; proof of the variability decomposition (SSEr, SStor, SSRegr); R-squared; t-test construction (we did not reach the p-values part).

 

Optional exercises for self-study from Rawlings' book: exercise 1.1, exercise 1.4, relevant bits of exercise 1.9, ex. 1.10, ex. 1.16.

slides_4.pdf (updated 14 November)

slides_4_annotated.pdf

StatisticalTables.pdf

Check "lecture 4" in Jörnsten's notes.

Lecture3.R

Demo3.R

Friday 11 Nov*: present work for minianalysis 1: MiniAnalysis1-2022.pdf

bikesharing.csv

TV.dat

*typically ~70 min only

 

46

Tuesday: standardised residuals; p-values; unbiasedness of MSE; confidence intervals for parameters and for E(Y0); prediction intervals for Ypred0; Simpson's paradox; notation for multiple linear regression.

slides_5.pdf

(see also the relevant bits in lectures 4-5 of Jörnsten's notes)

Demo4.R

Lecture4.R (this file contains some parts we have not covered yet, such as the F-test)

no minianalysis presentation this week

46

Thursday: properties of the estimators in multiple regression and sampling distributions; t-test and categorical covariates (not everything). These topics are also found in chapter 9 ("Class variables") in Rawlings et al., up to sect. 9.3.

 

Optional exercises for self-study from Rawlings' book: exercise 3.5 (parts a and d), ex. 3.10, ex. 3.11, ex. 3.12 (we never do regression without intercept... but if you are interested...), ex. 3.13.

slides_6.pdf

Check "lecture 10" in Jörnsten's notes.

 

advertising.R

Advertising.csv

multipleregression.R

auction.R

auction.dat

 

no minianalysis presentation this week

47

Tue: models with categorical and numerical covariates; problems with p-values and large datasets; multicollinearity.

slides_7.pdf (I have now moved some topics to slides_8.pdf)

Check "lecture 6" in Jörnsten's notes.

Lecture6.R

SA.dat

multicollinear.R

47

Thu: Variance Inflation Factor; partial F-test; greedy variable selection: backward search; bias/variance trade-off and the pMSE; training and testing.

slides_8.pdf (some stuff has moved to slides_9.pdf)

slides_8_annotated.pdf

Check "lecture 7" in Jörnsten's notes.

Lecture7.R

global+partial_Ftests.R

Friday 25 Nov*: present minianalysis 2

MiniAnalysis2-2022.pdf

*typically ~70 min only

https://chalmers.zoom.us/j/66122542774
Password: 172871

 

48

Tue: pMSE with regsubsets. Then (slides_9.pdf) Mallows' Cp; interactions (barely started).

 

slides_9.pdf

slides_9_annotated.pdf

regsubsets-categorical-covar.pdf

See also "lecture 11" in Jörnsten's notes.

auction.R (minimally changed)

regsubsets-olsrr-categorical.R

no minianalysis this week
48

 

Thu: interactions (continued); adjusted R-squared; Kullback-Leibler divergence; AIC, BIC; K-fold CV; LOOCV.

slides_10.pdf

(optional if you are interested) AkaikeEasyIntro.pdf

Lecture8.R

rsquaredAICstep.R

regsubsets-categorical-CV.R

no minianalysis this week

 

49

Tue: LOOCV; hat-matrix; residuals;

slides_11.pdf (Up to page 16)

slides_11_annotated.pdf

See also "lecture 14" in Jörnsten's notes.

(Additional support: Agresti's book chapters 4 and 14.4)

leverage-residuals-cook.R

f6data.txt

PROJECT is available: project22.pdf

kc_house_data.csv

controllers.dat

49

Thu: standardised residuals; studentised residuals; Cook's distance; DFBETAs; intro to GLMs; the exponential family.

slides_12.pdf

poisson_nb.R

f10.txt

f10b.txt

Friday 9 Dec*: present minianalysis 3

MiniAnalysis3-2022.pdf

https://chalmers.zoom.us/j/61249528380
Password: 065824

*typically ~70 min only

50

Tue: Newton-Raphson; Poisson regression; confidence intervals for GLMs; asymptotic properties of the MLE; CI for predictions; Wald test; deviance; likelihood ratio test.

 

completion of slides_12.pdf then slides_13.pdf

slides_13_annotated.pdf

poissregr-awards.R

poisson_sim.csv

 

 

no minianalysis this week
50

Thu: Poisson regression with an offset term; negative binomial regression, also with offset; quick tour through GLM diagnostics (diagnostics can be skipped for the exam).

 

slides_14.pdf

slides_14_annotated.pdf

shipdamage.R

poisson_nb.R

no minianalysis this week

 

Computer labs and software

Software: We will use the statistical package R to analyze data, through the RStudio interface. You will need to install both on your computer; see the instructions.
No previous knowledge of R is required. You are encouraged to attend the lab on Thursday 3 November to experiment with some basic analyses. No further computer lab will be given.

Some useful resources:

If you are familiar with MATLAB or Python, the following may be useful:

 
