Laurent Oudre

Machine Learning for Time Series (Master MVA)

Teaching material and outline of the course Machine Learning for Time Series (Master MVA) for the 2023-2024 academic year.

Course description

In many application contexts (health, economy, advertising...), the data collected takes the form of time series. The fundamental challenge is then to choose a suitable representation that captures the temporal information as well as possible.

Machine Learning for time series covers a large number of unsupervised or supervised tasks such as prediction, classification, completion/interpolation, query by content/indexing, clustering, segmentation/change-point detection or anomaly detection. In reality, however, most of the work of a data scientist dealing with temporal data consists of a series of hidden tasks such as:

Understand the data: know where they come from, how they were acquired and what their characteristics are, interact with domain experts and understand their problems
Improve the data: find accurate representation spaces where the events of interest can be seen, consolidate the data (denoising, detrending, detection/removal of outliers)
Model the data: physical/statistical or expert-based models, simple, adaptive and interpretable models
Extract information from the data: find repetitive patterns, features of interest, change-points

This course aims to provide an overview of ML techniques for studying time series across different tasks such as pattern extraction and recognition, anomaly detection, prediction, interpolation, etc. The course will mostly focus on these often poorly documented hidden tasks and introduce several recent ML methods that will help the future data scientist to mine, but above all to understand, time series. The course will be illustrated extensively with real data and problems from current challenges and will emphasize aspects related to data understanding and interpretation. Note that in its current form, the course will only marginally discuss Deep Learning algorithms.
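
As a rough illustration of the "improve the data" task listed above (detrending and detection of outliers), here is a minimal Python sketch. It is not taken from the course material: the function names, the synthetic signal, the least-squares linear trend and the median-absolute-deviation rule with a threshold of 3.5 are all assumptions chosen for this example.

```python
import numpy as np


def detrend_linear(x):
    """Remove a least-squares linear trend from a 1-D signal (illustrative helper)."""
    x = np.asarray(x, dtype=float)
    t = np.arange(len(x))
    slope, intercept = np.polyfit(t, x, deg=1)
    return x - (slope * t + intercept)


def flag_outliers(x, threshold=3.5):
    """Flag samples whose robust z-score (median / MAD based) exceeds the threshold."""
    x = np.asarray(x, dtype=float)
    median = np.median(x)
    mad = np.median(np.abs(x - median))
    if mad == 0:
        # Degenerate case: no spread, so nothing can be flagged with this rule.
        return np.zeros(len(x), dtype=bool)
    robust_z = 0.6745 * (x - median) / mad
    return np.abs(robust_z) > threshold


# Synthetic signal (assumption): linear trend + sinusoid + small noise + one spike.
rng = np.random.default_rng(0)
t = np.arange(200)
signal = 0.05 * t + np.sin(2 * np.pi * t / 25) + 0.1 * rng.standard_normal(200)
signal[120] += 8.0  # injected isolated outlier

detrended = detrend_linear(signal)
outliers = flag_outliers(detrended)
print("Flagged sample indices:", np.flatnonzero(outliers))
```

Detrending first matters here: on the raw signal the trend inflates the spread, which makes an isolated spike harder to separate from ordinary samples.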

Outline and planning

Lectures will take place on-site on Thursday afternoons at ENS Paris Saclay and will NOT be filmed or recorded. Lectures will be in French but all material (slides, homework...) is in English. For the tutorial sessions, two options will be available: a remote session on Zoom on Thursday mornings and an on-site session at ENS Paris Saclay on Thursday afternoons. Attendance at the lectures and tutorial sessions is mandatory.

05/10/2023 14:00 → 17:00 Amphi Hodgkins (0I10)	Introduction
05/10/2023 14:00 → 17:00 Amphi Hodgkins (0I10)	Lecture 1: Pattern Recognition and Detection. Problem statement; Comparing time series: Euclidean distance, Normalized Euclidean distance, Dynamic Time Warping (DTW); Detecting patterns in time series: Euclidean distance, DTW; Learning patterns from time series: Distance-based pattern extraction, Dictionary-based pattern extraction
12/10/2023 14:00 → 17:00 Amphi Hodgkins (0I10)	Lecture 2: Feature Extraction and Selection. Problem statement; Feature extraction: Stationarity and ergodicity, Statistical features, Spectral features, Local symbolic features, Information theory features, Deep Learning features, Other features; Feature selection: Unsupervised setting, Supervised setting
19/10/2023 09:00 → 12:00 Zoom OR 19/10/2023 14:00 → 17:00 Amphi Hodgkins (0I10)	Tutorial 1 on Lectures 1 & 2
26/10/2023 14:00 → 17:00 Amphi Hodgkins (0I10)	Lecture 3: Models and Representation Learning. Problem statement; Standard models: Sinusoidal model, Trend+Seasonality model, AR models (and variants), Hidden Markov model; Representation learning: Standard representations, Notion of sparsity, Sparse coding, Dictionary learning
09/11/2023 14:00 → 17:00 Amphi Hodgkins (0I10)	Lecture 4: Data Enhancement and Preprocessings. Problem statement; Denoising: Filtering, Sparse approximations, Low-rank approximations, Other techniques; Detrending: Least-squares regression, Other techniques; Interpolation of missing samples: Polynomial interpolation, Low-rank interpolation, Model-based interpolation; Outlier removal: Isolated samples, Contiguous samples
16/11/2023 09:00 → 12:00 Zoom OR 16/11/2023 14:00 → 17:00 Amphi Hodgkins (0I10)	Tutorial 2 on Lectures 3 & 4
23/11/2023 14:00 → 17:00 1Z14	Lecture 5: Change-Point and Anomaly Detection. Problem statement; Change-point detection: Dealing with non-stationary time series, Problem statement, Cost functions, Search methods, Finding the number of change points; Anomaly detection: Outlier detection, Statistical methods, Model-based methods, Distance-based methods; Evaluation of event detection methods
30/11/2023 14:00 → 17:00 Amphi Hodgkins (0I10)	Lecture 6: Multivariate Time Series. Problem statement; Models for multivariate time series: Vector autoregressive models, Multivariate dictionary learning; Graph signal processing: Concepts and definitions, Graph Fourier Transform (GFT), Bandlimitedness and smoothness, Graph filtering, Graph learning
07/12/2023 09:00 → 12:00 Zoom OR 07/12/2023 14:00 → 17:00 Amphi Hodgkins (0I10)	Tutorial 3 on Lectures 5 & 6
20/12/2023 (all day), 22/12/2023 (afternoon), 11/01/2024 (all day), 12/01/2024 (all day), Zoom	Oral presentations
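
To give a flavor of the first topic in the schedule above, the sketch below contrasts the Euclidean distance with Dynamic Time Warping (DTW) on two toy series. It is a textbook dynamic-programming formulation, not code from the course: the function names, the squared pointwise cost, the final square root and the example series are assumptions made for this illustration.

```python
import numpy as np


def euclidean_distance(x, y):
    """Euclidean distance between two series of equal length."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    if x.shape != y.shape:
        raise ValueError("Euclidean distance requires series of equal length")
    return float(np.sqrt(np.sum((x - y) ** 2)))


def dtw_distance(x, y):
    """DTW distance with squared pointwise cost, computed by dynamic programming."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    n, m = len(x), len(y)
    # cost[i, j] = minimal cumulated cost of aligning x[:i] with y[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = (x[i - 1] - y[j - 1]) ** 2
            cost[i, j] = local + min(
                cost[i - 1, j],      # x advances, y stays
                cost[i, j - 1],      # y advances, x stays
                cost[i - 1, j - 1],  # both advance
            )
    return float(np.sqrt(cost[n, m]))


# Same bump, shifted by one sample (toy data, assumption).
series_a = [0, 0, 1, 2, 1, 0, 0]
series_b = [0, 1, 2, 1, 0, 0, 0]

print("Euclidean:", euclidean_distance(series_a, series_b))
print("DTW:      ", dtw_distance(series_a, series_b))
```

The Euclidean distance compares samples index by index and therefore penalizes the time shift, whereas DTW is allowed to warp the time axis and can align the two bumps. This plain version runs in O(nm) time and memory and uses no warping-window constraint.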

Registration and mailing list

A registration form will be sent to all MVA students to subscribe to the course mailing list. The registration deadline is October 15th, 2023: no registration will be allowed after this date.

Tutorials

Tutorials will consist of interactive sessions where students will be introduced to useful Python packages for time series analysis and will have the opportunity to apply the different algorithms studied during the lectures. After each tutorial, students will be asked to work in pairs on a small project consisting of (almost) direct applications of the algorithms seen in the tutorial sessions. Students are required to bring their own personal computer to the tutorial sessions: details on installation and required configuration will be provided before the first tutorial session. For each tutorial, attendance at one of the two sessions (on-site or remote) is mandatory: absences must be justified, otherwise you will receive a FAIL. Missed or late assignments will also result in a FAIL in the course.

Validation

Tutorials (25%): commented notebooks and/or PDF reports
Mini-project (75%): Choice of one paper on a topic related to the course. A list of possible topics/projects will be provided, but students can bring their own topics. In this case, they must contact the lecturer in advance for approval. Mini-projects will be done in pairs.

Report (25%): PDF file, 5 pages - a template will be provided
Source code (25%): commented Jupyter notebook
Oral presentation (25%): 10 min presentation with slides

Additional resources

Useful references
List of possible topics/projects
