40718 Machine Learning Theory

Course Description

Machine learning theory concerns questions such as: What kinds of guarantees can we prove about practical machine learning methods, and can we design algorithms achieving desired guarantees? This course answers these questions by studying the theoretical aspects of machine learning, with a focus on statistically and computationally efficient learning. Broad topics will include: PAC-learning, uniform convergence, PAC-Bayesian, model selection; supervised learning algorithms including SVM, boosting, kernel methods; online learning algorithms, ranking algorithms, and analysis; unsupervised learning with guarantees.

Course Information

Required Texts

  1. [SB] Shai Shalev-Shwartz and Shai Ben-David, Understanding Machine Learning: From Theory to Algorithms, Cambridge University Press, 2014.

  2. [MRT] Mehryar Mohri and Afshin Rostamizadeh and Ameet Talwalkar, Foundations of Machine Learning, MIT Press, Second Edition, 2018.

  3. Please read the scribe notes for the first 10 sessions: Notes.

Optional Texts

  1. [KV] Michael Kearns and Umesh Virkumar Vazirani, An Introduction to Computational Learning Theory, MIT Press, 1994.

  2. [AB] Martin Anthony and Norman Biggs, Computational Learning Theory: An Introduction, Cambridge University Press, 1992.

Grading Policy

  1. 25%: Mid-term exam (1402/02/06).

  2. 25%: Final exam.

  3. 30%: Homeworks.

  4. 10%: Quiz.

  5. 10%: Paper & Project. Explore a theoretical or empirical question and present it. Deadline for choosing paper: 1402/02/06.

Lecture Schedule


Lecture Date Topics Related Readings and Links Homework & Assignments
1 1401-11-17Introduction: What is machine learning theory? Chapter 1 of MRT
Chapter 1 of SB
2 1401-11-24Formal model of learning and consistency model Section 2.1 of MRT
Chapter 6 of AB
3
4
1401-12-01
1401-12-06
PAC & Agnostic PAC
learning model
Sections 2.1 - 2.4 of MRT
Sections 3.1 & 3.2 & Chapter 4 of SB
5
6
1401-12-08
1401-12-13
Growth function and VC dimension
Rademacher complexity
Section 2.4 of MRT
Chapters 4 & 6 of SB
7
8
1401-12-15
1401-12-20
Nonuniform learning Chapter 7 of SB
9 1401-12-22Computational complexity of learning Chapter 8 of SB
10
11
1402-01-14
1402-01-19
Boosting Chapter 7 of MRT
Chapter 10 of SB
12
13
14
1402-01-21
1402-01-26
1402-01-28
Convex learning problemsChapters 12 and 14 of SB
15 1402-02-04Model selection Chapter 4 of MRT
Chapter 5 of SB
16
17
1402-02-04
1402-02-09
Support vector machines Chapter 5 of MRT
Section 9.1 and Chapter 15 of SB
18 1402-02-11Kernel methods Chapter 6 of MRT
Chapter 16 of SB
19
20
21
22
1402-02-16
1402-02-18
1402-02-23
1402-02-25
On-line learning Chapter 8 of MRT
Chapter 21 of SB
23
24
1402-02-30
1402-03-01
Ranking algorithms Chapter 9 of MRT
Chapter 21 of SB
25 1402-03-06Regression algorithms Chapter 11 of MRT
Chapter 11 of SB
26 1402-03-08PAC-Bayesian theory Chapter 31 of SB