Mathematics, Physics & Machine Learning Seminar

Past sessions

Gabriel Peyré, École Normale Supérieure
14/04/2021, 18:00 — 19:00 — Online

Scaling Optimal Transport for High-dimensional Learning

Optimal transport (OT) has recently gained a lot of interest in machine learning. It is a natural tool to compare probability distributions in a geometrically faithful way. It finds applications in both supervised learning (using geometric loss functions) and unsupervised learning (to perform generative model fitting). OT is, however, plagued by the curse of dimensionality, since it might require a number of samples which grows exponentially with the dimension. In this talk, I will explain how to leverage entropic regularization methods to define computationally efficient loss functions, approximating OT with a better sample complexity.

More information and references can be found on the website of our book "Computational Optimal Transport".
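As a concrete sketch of the entropic approach (a minimal example with illustrative grid, cost, and regularization strength `eps`, not code from the talk), the regularized problem can be solved by Sinkhorn's matrix-scaling iterations:

```python
import numpy as np

def sinkhorn(a, b, C, eps=0.1, n_iters=500):
    """Entropic-regularized OT plan between histograms a and b for cost C."""
    K = np.exp(-C / eps)                  # Gibbs kernel
    u, v = np.ones_like(a), np.ones_like(b)
    for _ in range(n_iters):
        u = a / (K @ v)                   # rescale to match the row marginal a
        v = b / (K.T @ u)                 # rescale to match the column marginal b
    return u[:, None] * K * v[None, :]

# two Gaussian-like histograms on a 1D grid (illustrative choices)
x = np.linspace(0, 1, 50)
a = np.exp(-((x - 0.2) ** 2) / 0.01); a /= a.sum()
b = np.exp(-((x - 0.7) ** 2) / 0.01); b /= b.sum()
C = (x[:, None] - x[None, :]) ** 2        # squared-distance ground cost
P = sinkhorn(a, b, C)
reg_cost = np.sum(P * C)                  # entropic approximation of the OT cost
```

Each iteration costs only matrix-vector products with the kernel K, which is what makes the entropic loss computationally attractive compared with solving the exact linear program.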

Pedro A. Santos, Instituto Superior Técnico and INESC-ID
09/04/2021, 14:00 — 15:00 — Online

Two-time scale stochastic approximation for reinforcement learning with linear function approximation

In this presentation, I will introduce some traditional Reinforcement Learning problems and algorithms, and analyze how some problems can be avoided and convergence results obtained using a two-time scale variation of the usual stochastic approximation approach.

This variation was inspired by the practical successes of Deep Q-Learning in attaining superhuman performance at some classical Atari games by DeepMind's research team in 2015. Practical machine learning successes like this often have no corresponding explanatory theory. The work that will be presented intends to contribute to closing that gap.

Joint work with Diogo Carvalho and Francisco Melo from INESC-ID.
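The talk's algorithm is not reproduced here, but the two-time-scale idea can be illustrated with a scalar toy problem of my own construction: a fast iterate chases a Bellman-style target built with a slowly moving parameter (echoing the frozen target network of Deep Q-Learning), while the slow parameter tracks the fast one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem (illustrative, not the talk's algorithm): solve the Bellman-like
# fixed point  theta = r_bar + gamma * theta  from noisy samples of r.
# The fast iterate w moves toward a target computed with the slowly-moving
# theta; theta tracks w on the slower time scale, with beta_k = o(alpha_k).
r_bar, gamma = 1.0, 0.5
w, theta = 0.0, 0.0
for k in range(1, 50001):
    r = r_bar + 0.1 * rng.standard_normal()   # noisy reward sample
    alpha, beta = k ** -0.6, k ** -0.9        # fast and slow step sizes
    w += alpha * (r + gamma * theta - w)      # fast: chase the current target
    theta += beta * (w - theta)               # slow: track the fast iterate
# both iterates approach the fixed point theta* = r_bar / (1 - gamma) = 2
```

The separation of time scales is what lets the fast variable be analyzed as if the slow one were frozen, which is the route to the convergence results mentioned in the abstract.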

See also

Santos PA slides.pdf

Steve Brunton, University of Washington
31/03/2021, 18:00 — 19:00 — Online

Machine learning for Fluid Mechanics

Many tasks in fluid mechanics, such as design optimization and control, are challenging because fluids are nonlinear and exhibit a large range of scales in both space and time. This range of scales necessitates exceedingly high-dimensional measurements and computational discretization to resolve all relevant features, resulting in vast data sets and time-intensive computations. Indeed, fluid dynamics is one of the original big data fields, and many high-performance computing architectures, experimental measurement techniques, and advanced data processing and visualization algorithms were driven by decades of research in fluid mechanics. Machine learning constitutes a growing set of powerful techniques to extract patterns and build models from this data, complementing the existing theoretical, numerical, and experimental efforts in fluid mechanics. In this talk, we will explore current goals and opportunities for machine learning in fluid mechanics, and we will highlight a number of recent technical advances. Because fluid dynamics is central to transportation, health, and defense systems, we will emphasize the importance of machine learning solutions that are interpretable, explainable, generalizable, and that respect known physics.

See also

Brunton slides.pdf

Markus Heyl, Max-Planck Institute for the Physics of Complex Systems, Dresden
22/03/2021, 17:00 — 18:00 — Online

Quantum many-body dynamics in two dimensions with artificial neural networks

In the last two decades the field of nonequilibrium quantum many-body physics has seen a rapid development, driven, in particular, by the remarkable progress in quantum simulators, which today provide access to dynamics in quantum matter with unprecedented control. However, the efficient numerical simulation of nonequilibrium real-time evolution in isolated quantum matter still remains a key challenge for current computational methods, especially beyond one spatial dimension. In this talk I will present a versatile and efficient machine-learning-inspired approach. I will first introduce the general idea of encoding quantum many-body wave functions into artificial neural networks. I will then identify and resolve key challenges for the simulation of real-time evolution, which previously imposed significant limitations on the accurate description of large systems and long-time dynamics. As a concrete example, I will consider the dynamics of the paradigmatic two-dimensional transverse-field Ising model, where we observe collapse-and-revival oscillations of ferromagnetic order and demonstrate that the reached time scales are comparable to or exceed the capabilities of state-of-the-art tensor network methods.

See also

Heyl slides.pdf

Hsin-Yuan (Robert) Huang, Caltech
17/03/2021, 18:00 — 19:00 — Online

Information-theoretic bounds on quantum advantage in machine learning

We compare the complexity of training classical and quantum machine learning (ML) models for predicting outcomes of physical experiments. The experiments depend on an input parameter $x$ and involve the execution of a (possibly unknown) quantum process $E$. Our figure of merit is the number of runs of $E$ needed during training, disregarding other measures of complexity. A classical ML model performs a measurement and records the classical outcome after each run of $E$, while a quantum ML model can access $E$ coherently to acquire quantum data; the classical or quantum data is then used to predict outcomes of future experiments. We prove that, for any input distribution $D(x)$, a classical ML model can provide accurate predictions on average by accessing $E$ a number of times comparable to the optimal quantum ML model. In contrast, for achieving accurate predictions on all inputs, we show that an exponential quantum advantage exists in certain tasks. For example, to predict expectation values of all Pauli observables in an $n$-qubit system, we present a quantum ML model using only $O(n)$ data and prove that any classical ML model requires $2^{\Omega(n)}$ data.

See also

Huang slides.pdf

A. Pedro Aguiar, Faculdade de Engenharia, Universidade do Porto
03/03/2021, 18:00 — 19:00 — Online

Model-based control design combining Lyapunov and optimization tools: Examples in the area of motion control of autonomous robotic vehicles

The past few decades have witnessed a significant research effort in the field of Lyapunov model-based control design. In parallel, optimal control and optimization model-based design have also expanded their range of applications, and nowadays receding-horizon approaches can be considered a mature field for particular classes of control systems.

In this talk, I will argue that Lyapunov-based techniques play an important role in the analysis of model-based optimization methodologies and, moreover, that both approaches can be combined for control design, resulting in powerful frameworks with formal guarantees of robustness, stability, performance, and safety. Illustrative examples in the area of motion control of autonomous robotic vehicles will be presented for Autonomous Underwater Vehicles (AUVs), Autonomous Surface Vehicles (ASVs), and Unmanned Aerial Vehicles (UAVs).

See also

Aguiar slides.pdf

Maciej Koch-Janusz, University of Zurich
22/02/2021, 17:00 — 18:00 — Online

Statistical physics through the lens of real-space mutual information

Identifying the relevant coarse-grained degrees of freedom in a complex physical system is a key stage in developing effective theories. The renormalization group (RG) provides a framework for this task, but its practical execution in unfamiliar systems is fraught with ad hoc choices. Machine learning approaches, on the other hand, though promising, often lack formal interpretability: it is unclear what relation, if any, the architecture- and training-dependent learned "relevant" features bear to standard objects of physical theory.
I will present recent results addressing both issues. We develop a fast algorithm, RSMI-NE, employing state-of-the-art results in machine-learning-based estimation of information-theoretic quantities to construct the optimal coarse-graining. We use it to develop a new approach to identifying the most relevant field theory operators describing a statistical system, which we validate on the example of the interacting dimer model. I will also discuss formal results underlying the method: we establish the equivalence between the information-theoretic notion of relevance defined in the Information Bottleneck (IB) formalism of compression theory and the field-theoretic relevance of the RG. We show analytically that for statistical physical systems the "relevant" degrees of freedom found using IB compression indeed correspond to operators with the lowest scaling dimensions, providing a dictionary connecting two distinct theoretical toolboxes.

See also

Koch-Janusz slides.pdf

Mário Figueiredo, Instituto Superior Técnico and IT
17/02/2021, 18:00 — 19:00 — Online

Dealing with Correlated Variables in Supervised Learning

Linear (and generalized linear) regression (LR) is an old, but still essential, statistical tool: its goal is to learn to predict a (response) variable from a linear combination of other (explanatory) variables. A central problem in LR is the selection of relevant variables, because using fewer variables tends to yield better generalization and because this identification may be meaningful (e.g., which genes are relevant to predict a certain disease). In the past quarter-century, variable selection (VS) based on sparsity-inducing regularizers has been a central paradigm, the most famous example being the LASSO, which has been intensively studied, extended, and applied.

In many contexts, it is natural to have highly correlated variables (e.g., several genes that are strongly co-regulated), which are thus simultaneously relevant as predictors. In this case, sparsity-based VS may fail: it may select an arbitrary subset of these variables, and it is unstable. Moreover, it is often desirable to identify all the relevant variables, not just an arbitrary subset thereof, a goal for which several approaches have been proposed. This talk will be devoted to a recent class of such approaches, called ordered weighted l1 (OWL). The key feature of OWL is that it is provably able to explicitly identify (i.e., cluster) sufficiently correlated features, without having to compute these correlations. Several theoretical results characterizing OWL will be presented, including connections to the mathematics of economic inequality. Computational and optimization aspects will also be addressed, as well as recent applications in subspace clustering, learning Gaussian graphical models, and deep neural networks.
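A minimal sketch of the OWL regularizer itself (the weight choices below are illustrative: equal weights recover the LASSO's l1 norm, and linearly decreasing weights give an OSCAR-style instance):

```python
import numpy as np

def owl_penalty(x, w):
    """Ordered weighted l1 norm: sum_i w_i * |x|_(i), where |x| is sorted in
    decreasing order and w is nonnegative and nonincreasing."""
    return float(np.dot(w, np.sort(np.abs(x))[::-1]))

x = np.array([3.0, -1.0, 2.0])
w_lasso = np.array([1.0, 1.0, 1.0])        # equal weights: plain l1 norm
w_oscar = 1.0 + 0.5 * np.arange(3)[::-1]   # linearly decreasing (OSCAR-style)
owl_penalty(x, w_lasso)  # 6.0 = ||x||_1
owl_penalty(x, w_oscar)  # 2.0*3 + 1.5*2 + 1.0*1 = 10.0
```

Because larger magnitudes are matched with larger weights, decreasing weight sequences penalize spread among large coefficients, which is the mechanism behind the clustering of correlated features mentioned above.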

See also

Figueiredo slides.pdf

Caroline Uhler, MIT and Institute for Data, Systems and Society
10/02/2021, 18:00 — 19:00 — Online

Causal Inference and Overparameterized Autoencoders in the Light of Drug Repurposing for SARS-CoV-2

Massive data collection holds the promise of a better understanding of complex phenomena and, ultimately, of better decisions. An exciting opportunity in this regard stems from the growing availability of perturbation/intervention data (drugs, knockouts, overexpression, etc.) in biology. In order to obtain mechanistic insights from such data, a major challenge is the development of a framework that integrates observational and interventional data and allows predicting the effect of yet unseen interventions or transporting the effect of interventions observed in one context to another. I will present a framework for causal structure discovery based on such data and highlight the role of overparameterized autoencoders. We end by demonstrating how these ideas can be applied for drug repurposing in the current SARS-CoV-2 crisis.

See also

Uhler slides.pdf

Miguel Couceiro, Université de Lorraine
03/02/2021, 18:00 — 19:00 — Online

Making ML Models fairer through explanations, feature dropout, and aggregation

Algorithmic decisions are now being used on a daily basis and are based on Machine Learning (ML) processes that may be complex and biased. This raises several concerns given the critical impact that biased decisions may have on individuals or on society as a whole. Not only do unfair outcomes affect human rights, they also undermine public trust in ML and AI. In this talk, we will address fairness issues of ML models based on decision outcomes, and we will show how the simple idea of feature dropout followed by an ensemble approach can improve model fairness without compromising its accuracy. To illustrate, we will present a general workflow that relies on explainers to tackle process fairness, which essentially measures a model's reliance on sensitive or discriminatory features. We will present different applications and empirical settings that show improvements not only with respect to process fairness but also other fairness metrics.

See also

Couceiro slides.pdf

Xavier Bresson, Nanyang Technological University
27/01/2021, 11:00 — 12:00 — Online

Benchmarking Graph Neural Networks

Graph neural networks (GNNs) have become the standard toolkit for analyzing and learning from data on graphs. As the field grows, it becomes critical to identify key architectures and validate new ideas that generalize to larger, more complex datasets. Unfortunately, it has been increasingly difficult to gauge the effectiveness of new models in the absence of a standardized benchmark with consistent experimental settings. In this work, we introduce a reproducible GNN benchmarking framework, with the facility for researchers to add new models conveniently for arbitrary datasets. We demonstrate the usefulness of our framework by presenting a principled investigation into the recent Weisfeiler-Lehman GNNs (WL-GNNs) compared to message-passing-based graph convolutional networks (GCNs) for a variety of graph tasks with medium-scale datasets.

See also

Bresson slides.pdf

James Halverson, Northeastern University
20/01/2021, 18:00 — 19:00 — Online

Neural Networks and Quantum Field Theory

In this talk I will review essentials of quantum field theory (QFT) and demonstrate how the function-space distribution of many neural networks (NNs) shares similar properties. This allows, for instance, computation of correlators of neural network outputs in terms of Feynman diagrams and a direct analogy between non-Gaussian corrections in NN distributions and particle interactions. Some cases yield divergences in perturbation theory, requiring the introduction of regularization and renormalization. Potential advantages of this perspective will be discussed, including a duality between function-space and parameter-space descriptions of neural networks.
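The width dependence underlying this correspondence can be seen numerically in a small experiment of my own (not from the talk): as the hidden width grows, the output distribution of random one-hidden-layer networks approaches a Gaussian, so its excess kurtosis (a fourth-cumulant "interaction" diagnostic that vanishes for a Gaussian) shrinks.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_outputs(width, n_nets=20000, x=1.0):
    """Outputs at a fixed input of random one-hidden-layer tanh networks,
    normalized by 1/sqrt(width) so the infinite-width limit is Gaussian."""
    a = rng.standard_normal((n_nets, width))  # readout weights
    b = rng.standard_normal((n_nets, width))  # input weights
    c = rng.standard_normal((n_nets, width))  # biases
    return (a * np.tanh(b * x + c)).sum(axis=1) / np.sqrt(width)

def excess_kurtosis(z):
    """Fourth-cumulant diagnostic: zero for a Gaussian distribution."""
    z = z - z.mean()
    return np.mean(z**4) / np.mean(z**2) ** 2 - 3.0

k_narrow = excess_kurtosis(sample_outputs(width=1))   # clearly non-Gaussian
k_wide = excess_kurtosis(sample_outputs(width=100))   # close to Gaussian
```

In the QFT analogy sketched in the abstract, the nonzero higher cumulants at finite width play the role of interaction terms, suppressed by powers of the width.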

See also

Halverson slides.pdf

Anna C. Gilbert, Yale University
13/01/2021, 18:00 — 19:00 — Online

Metric representations: Algorithms and Geometry

Given a set of distances amongst points, determining what metric representation is most "consistent" with the input distances, or what metric best captures the relevant geometric features of the data, is a key step in many machine learning algorithms. In this talk, we focus on three specific metric-constrained problems, a class of optimization problems with metric constraints: metric nearness (Brickell et al. (2008)), weighted correlation clustering on general graphs (Bansal et al. (2004)), and metric learning (Bellet et al. (2013); Davis et al. (2007)).

Because of the large number of constraints in these problems, however, researchers have been forced to restrict either the kinds of metrics learned or the size of the problem that can be solved. We provide an algorithm, PROJECT AND FORGET, that uses Bregman projections with cutting planes to solve metric-constrained problems with many (possibly exponentially many) inequality constraints. We also prove that our algorithm converges to the global optimal solution and show that the optimality error decays asymptotically at an exponential rate. We show that using our method we can solve large instances of these three types of metric-constrained problems, outperforming all state-of-the-art methods with respect to CPU times and problem sizes.

Finally, we discuss the adaptation of PROJECT AND FORGET to specific types of metric constraints, namely tree and hyperbolic metrics.
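A much-simplified sketch of the projection idea in the metric nearness setting (my own construction: plain cyclic l2 projections onto the triangle-inequality halfspaces, without the Bregman correction terms or the "forget" step for inactive constraints that the actual algorithm uses):

```python
import numpy as np
from itertools import permutations

def triangle_fix(D, n_sweeps=20):
    """Cyclically apply l2 projections onto the halfspaces
    d_ij <= d_ik + d_kj until all triangle inequalities hold."""
    D = D.astype(float).copy()
    n = D.shape[0]
    for _ in range(n_sweeps):
        for i, j, k in permutations(range(n), 3):
            v = D[i, j] - D[i, k] - D[k, j]
            if v > 0:  # violated triangle: split the excess over the three edges
                D[i, j] -= v / 3; D[j, i] = D[i, j]
                D[i, k] += v / 3; D[k, i] = D[i, k]
                D[k, j] += v / 3; D[j, k] = D[k, j]
    return D

# 3 points where one "distance" is far too large to be a metric
D = np.array([[0.0, 10.0, 1.0],
              [10.0, 0.0, 1.0],
              [1.0, 1.0, 0.0]])
M = triangle_fix(D)  # nearby symmetric matrix satisfying the triangle inequality
```

The cutting-plane aspect of the real algorithm amounts to only visiting (and later discarding) constraints found to be violated, rather than sweeping over all triples as this toy does.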

Sanjeev Arora, Computer Science Department, Princeton University
06/01/2021, 18:00 — 19:00 — Online

The quest for mathematical understanding of deep learning

Deep learning has transformed Machine Learning and Artificial Intelligence in the past decade. It raises fundamental questions for mathematics and theoretical computer science, since it relies upon solving large-scale nonconvex problems via gradient descent and its variants. This talk will be an introduction to mathematical questions raised by deep learning, and to some partial understanding obtained in recent years with respect to optimization, generalization, self-supervised learning, privacy, etc.

René Vidal, Mathematical Institute for Data Science, Johns Hopkins University
16/12/2020, 18:00 — 19:00 — Online

From Optimization Algorithms to Dynamical Systems and Back

Recent work has shown that tools from dynamical systems can be used to analyze accelerated optimization algorithms. For example, it has been shown that the continuous limit of Nesterov's accelerated gradient (NAG) gives an ODE whose convergence rate matches that of NAG for convex, unconstrained, and smooth problems. Conversely, it has been shown that NAG can be obtained as the discretization of an ODE; however, since different discretizations lead to different algorithms, the choice of discretization becomes important. The first part of this talk will extend this type of analysis to convex, constrained, and non-smooth problems by using Lyapunov stability theory to analyze continuous limits of the Alternating Direction Method of Multipliers (ADMM). The second part of this talk will show that many existing and new optimization algorithms can be obtained by suitably discretizing a dissipative Hamiltonian. As an example, we will present a new method called Relativistic Gradient Descent (RGD), which empirically outperforms momentum, RMSprop, Adam, and AdaGrad on several non-convex problems.

This is joint work with Guilherme França, Daniel Robinson and Jeremias Sulam.
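The discretization viewpoint can be illustrated with the classical (non-relativistic) special case, which is standard material rather than the talk's RGD method: a semi-implicit Euler discretization of damped Newtonian dynamics recovers the heavy-ball/momentum method. Step size and damping below are illustrative choices.

```python
import numpy as np

def momentum_descent(grad, x0, h=0.1, gamma=1.0, n_steps=500):
    """Semi-implicit Euler discretization of the damped dynamics
    x'' + gamma * x' = -grad f(x); this recovers the classical
    heavy-ball/momentum method."""
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    for _ in range(n_steps):
        v = (1 - gamma * h) * v - h * grad(x)  # damped velocity update
        x = x + h * v                          # position update
    return x

# minimize the quadratic bowl f(x) = 0.5 * ||x||^2 (gradient is x itself)
x_star = momentum_descent(lambda x: x, x0=[3.0, -2.0])
```

Unrolling the two updates gives x_{k+1} = x_k + (1 - gamma*h)(x_k - x_{k-1}) - h^2 grad f(x_k), i.e. momentum with coefficient 1 - gamma*h and learning rate h^2, which is one concrete instance of the "different discretizations give different algorithms" point above.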

Projecto FCT UIDB/04459/2020.

Samantha Kleinberg, Stevens Institute of Technology
09/12/2020, 18:00 — 19:00 — Online

Data, Decisions, and You: Making Causality Useful and Usable in a Complex World

The collection of massive observational datasets has led to unprecedented opportunities for causal inference, such as using electronic health records to identify risk factors for disease. However, our ability to understand these complex data sets has not grown at the same pace as our ability to collect them. While causal inference has traditionally focused on pairwise relationships between variables, biological systems are highly complex, and knowing when events may happen is often as important as knowing whether they will. In the first half of this talk I discuss new methods that allow causal relationships to be reliably inferred from complex observational data, motivated by analysis of intensive care unit and other medical data. Causes are useful because they allow us to take action, but there is a gap between the output of machine learning and what helps people make decisions. In the second part of this talk I discuss our recent findings on how people actually fare when using the output of machine learning, and how we can go from data to knowledge to decisions.

See also

Kleinberg slides.pdf


Gitta Kutyniok, Mathematical Institute of the University of Munich
02/12/2020, 18:00 — 19:00 — Online

Deep Learning meets Physics: Taking the Best out of Both Worlds in Imaging Science

Pure model-based approaches are today often insufficient for solving complex inverse problems in imaging. At the same time, we witness the tremendous success of data-based methodologies, in particular, deep neural networks for such problems. However, pure deep learning approaches often neglect known and valuable information from physics.

In this talk, we will provide an introduction to this class of problems and then discuss a general conceptual approach to inverse problems in imaging which combines deep learning and physics. This hybrid approach is based on shearlet-based sparse regularization and deep learning, and is guided by a microlocal analysis viewpoint to pay particular attention to the singularity structures of the data. Finally, we will present several applications such as tomographic reconstruction and show that our approach outperforms previous methodologies, including methods entirely based on deep learning.

See also

Kutyniok slides.pdf


Tommaso Dorigo, Italian Institute for Nuclear Physics
25/11/2020, 18:00 — 19:00 — Online

Dealing with Systematic Uncertainties in HEP Analysis with Machine Learning Methods

I will discuss the impact of nuisance parameters on the effectiveness of supervised classification in high energy physics problems, and techniques that may mitigate or remove their effect in the search for optimal selection criteria and variable transformations. The approaches discussed include nuisance-parametrized models, modified or adversarial losses, semi-supervised learning approaches, and inference-aware techniques.

See also

Dorigo slides.pdf


Carola-Bibiane Schönlieb, DAMTP, University of Cambridge
20/11/2020, 15:00 — 16:00 — Online

Combining knowledge and data driven methods for solving inverse imaging problems - getting the best from both worlds

Inverse problems in imaging range from tomographic reconstruction (CT, MRI, etc.) to image deconvolution, segmentation, and classification, just to name a few. In this talk I will discuss approaches to inverse imaging problems which have both a mathematical modelling (knowledge-driven) and a machine learning (data-driven) component. Mathematical modelling is crucial in the presence of ill-posedness, making use of information about the imaging data to narrow down the search space. Such an approach results in highly generalizable reconstruction and analysis methods which come with desirable solution guarantees. Machine learning, on the other hand, is a powerful tool for customising methods to individual data sets. Highly parametrised models such as deep neural networks, in particular, are powerful tools for accurately modelling prior information about solutions. The combination of these two paradigms, getting the best from both worlds, is the topic of this talk, furnished with examples for image classification under minimal supervision and for tomographic image reconstruction.


Bin Dong, BICMR, Peking University
11/11/2020, 11:00 — 12:00 — Online

Learning and Learning to Solve PDEs

Deep learning continues to dominate machine learning and has been successful in computer vision, natural language processing, etc. Its impact has now expanded to many research areas in science and engineering. In this talk, I will mainly focus on some recent impact of deep learning on computational mathematics. I will present our recent work on bridging deep neural networks with numerical differential equations. On the one hand, I will show how to design transparent deep convolutional networks to uncover hidden PDE models from observed dynamical data. On the other hand, I will present our preliminary attempt to establish a deep reinforcement learning based framework to solve 1D scalar conservation laws, and a meta-learning approach for solving linear parameterized PDEs based on the multigrid method.
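A minimal illustration of the bridge the talk builds on (a toy of my own, not the speaker's code): a constrained convolution kernel acts as a finite-difference stencil, so a network layer with such kernels can represent the differential operators of a hidden PDE.

```python
import numpy as np

# The stencil [1, -2, 1] / dx^2 is the convolution kernel that computes a
# second derivative -- the kind of constrained kernel a "transparent" network
# can learn in order to represent a differential operator.
dx = 0.01
x = np.arange(0.0, 2 * np.pi, dx)
u = np.sin(x)
kernel = np.array([1.0, -2.0, 1.0]) / dx**2
u_xx = np.convolve(u, kernel, mode="same")     # estimate of u''
# away from the boundary this should match the exact value u'' = -sin(x)
err = np.max(np.abs(u_xx[2:-2] + np.sin(x[2:-2])))
```

Learning such kernels from trajectories of u, with their stencil structure constrained, is what makes the recovered model readable as a PDE rather than a black box.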

See also

Dong slides.pdf



The IST seminar series Mathematics, Physics & Machine Learning aims to bring together mathematicians and physicists interested in machine learning (ML) with ML and AI experts interested in mathematics and physics, with the goals of introducing innovative mathematics- and physics-inspired techniques in ML and, reciprocally, applying ML to problems in mathematics and physics.

Organizers: Cláudia Nunes (DM and CEMAT), Cláudia Soares (DEEC and ISR), Francisco Melo (DEI and INESC-ID), João Seixas (DF and CEFEMA), João Xavier (DEEC and ISR), José Mourão (DM and CAMGSD), Mário Figueiredo (DEEC and IT), Pedro Alexandre Santos (DM and INESC-ID)  and Yasser Omar (DM and IT).
Zoom password: distributed with the email announcements, or send an email to the organizers asking for it.