CSED515/ITCE504 Machine Learning, Spring 2017


Announcements

Primary Textbook

Lectures

Dates and Titles / Topics / Lecture Slides / Suggested Further Readings
Lecture 1
Introduction to machine learning
  • Supervised learning
  • Unsupervised learning
  • Probability primer


Lecture 2
Density estimation

  • Maximum likelihood estimation
  • MAP estimation
  • Bayesian estimation


Lecture 3
Clustering I

  • k-means clustering
  • Mixture of Gaussians (MoG)


  • Murphy 11.2
  • Bishop 9.1 and 9.2
Lecture 4
Expectation Maximization

  • Jensen's inequality
  • Information theory preliminaries
  • EM optimization
  • Generalized EM
  • Incremental EM
  • EM for exponential families


Lecture 5
Latent variable models

  • Maximum likelihood factor analysis
  • Probabilistic PCA
  • Mixture of factor analyzers
  • Mixture of probabilistic principal component analyzers
  • SVD
  • Probabilistic latent semantic analysis (PLSA)


Lecture 6
Clustering II

  • Nonnegative matrix factorization
  • Spectral clustering


  • D. D. Lee and H. S. Seung (1999),
    "Learning the parts of objects by non-negative matrix factorization,"
    Nature,
    vol. 401, pp. 788-791.
  • C. Ding, T. Li, W. Peng, and H. Park (2006),
    "Orthogonal nonnegative matrix tri-factorizations for clustering,"
    KDD-2006.
  • A. Cichocki, H. Lee, Y.-D. Kim, and S. Choi (2008),
    "Nonnegative matrix factorization with alpha-divergence,"
    Pattern Recognition Letters,
    vol. 29, no. 9, pp. 1433-1440.
  • J. Shi and J. Malik (2000),
    "Normalized cuts and image segmentation,"
    IEEE Trans. Pattern Analysis and Machine Intelligence,
    vol. 22, no. 8, pp. 888-905.
  • U. von Luxburg (2007),
    "A tutorial on spectral clustering,"
    Statistics and Computing,
    vol. 17, no. 4, pp. 395-416.
Lecture 7
Regression

  • Regression
  • Linear models for regression
  • Least squares and RLS
  • Bias-variance dilemma
  • Bayesian linear regression
  • Gaussian process regression


Lecture 8
Linear models for classification

  • Bayes decision theory
  • Fisher's linear discriminant analysis
  • Logistic regression
  • Perceptron
  • Support vector machine


Lecture 9
Neural networks

  • Adaline
  • Perceptron
  • Multilayer perceptron (MLP)
  • Radial basis function (RBF) network
  • An overview of deep learning


  • Murphy 16.5
  • Bishop 5

Lecture 10
Mixture of experts

  • Mixture of experts (MoE)


  • Murphy 11.2.4
  • Bishop 14.5

Lecture 11
Kernel methods

  • Kernel PCA (KPCA)


  • Murphy 14.4.4
  • Bishop 12.3

Lecture 12
Hidden Markov models

  • Hidden Markov models (HMMs)


  • Bishop 13.2

Lecture 13
Reinforcement learning

  • Markov decision process (MDP)
  • Value-based RL
  • Policy-based RL


  • Sutton and Barto, Reinforcement Learning: An Introduction

Homework Assignments (list of old ones)

  • Hwk 1

  • Hwk 2

  • Hwk 3

  • Hwk 4

  • Hwk 5

  • Hwk 6

  • Hwk 7