Machine Learning

Degree(s)

Master ICFP

M2 ICFP

Location

ENS-PSL

Spring-Summer

Level: Master 2 - 3 ECTS - In English

Instructor(s): Marc LELARGE (ENS-PSL)

Contact - Teaching Secretariat

Tel: +33 (1) 44 32 35 60
enseignement@phys.ens.fr

Statistical machine learning is a growing discipline at the intersection of computer science and applied mathematics (probability and statistics, optimization, etc.), and it plays an increasingly important role in many other scientific disciplines.

This course will cover supervised learning, unsupervised learning, and deep learning. The first part of the course, on statistical machine learning, will focus on the analysis of high-dimensional data and on the efficiency of the algorithms needed to process the large amounts of data encountered in many application areas.

The second part of the course will present the fundamental principles and methods of recent deep learning techniques and their links with theoretical physics.

Evaluation

Homework and oral final project.

Prerequisites

Proficiency in Python (those less familiar with Python should use the tutorial here)
Basic Calculus, Linear Algebra
Basic Probability and Statistics

Syllabus

1. Fundamentals of predictions and supervised learning

Fundamentals of predictions

Minimizing errors
Modeling knowledge
Prediction via optimization
Types of errors and successes
The Neyman-Pearson Lemma
Properties of ROC curves

Supervised learning

Sample versus Population
A first learning algorithm: the perceptron
Connection to empirical risk minimization
Formal guarantees for the perceptron

2. Unsupervised learning: k-means and EM

K-means clustering
Mixtures of Gaussians
Expectation-Maximization for GMM

3. Kernels

Local averaging methods
- partitions estimators
- k-nearest neighbors
- kernel smoothing
Positive-definite kernel methods
- representer theorem
- kernel trick

4. Bayesian and Variational Inference

Gaussian
Linear regression
Logistic regression
Laplace method
Variational inference

5. Optimization for machine learning

Gradient descent
Stochastic gradient descent (SGD)
Over-parameterized models

6. PyTorch basics and autodiff

7. Autoencoders and Normalizing Flows

8. Diffusion models (DDPM)

Module 18a - Denoising Diffusion Probabilistic Models

Resources:

Data sciences

Computational physics