PDEs and Wasserstein Spaces

Seminario de Análisis Matemático y Matemática Aplicada UCM
Nov. 3, 2022

Author: David Gómez-Castro

Affiliation: Universidad Complutense de Madrid

gomezcastro.xyz

Some simple PDEs

Transport equations. Constant velocity

One of the easiest PDEs to solve is the following: for \(t,x \in \mathbb R\) and a constant velocity \(a \in \mathbb R\), \[ \frac{\partial \rho_t}{\partial t} (x) + a \frac{\partial \rho_t}{\partial x} (x) = 0. \]

It can be solved by characteristics: we look for curves \(X_t(y)\) along which the solution is constant, \(\rho_t (X_t(y)) = \rho_0(y)\). Differentiating in \(t\),

\[ 0 = \frac{\partial }{\partial t} \Big( \rho_t \circ X_t \Big) = \frac{\partial \rho_t}{\partial t} (X_t) + \frac{\partial \rho_t}{\partial x} (X_t) \frac{\partial X_t}{\partial t} \]

It suffices that \(\frac{\partial X_t}{\partial t} = a\), with \(X_0(y) = y\) to match the initial datum.

Hence \(X_t(y) = y + at\), and

\[ \rho_t(x) = \rho_0 (x - at) \]
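The formula can be checked numerically. The following is a minimal sketch, assuming a Gaussian initial datum `rho0` and the velocity `a = 2.0` (both chosen only for illustration): it evaluates the PDE residual of \(\rho_0(x - at)\) with centred finite differences.

```python
import math

def rho0(x):
    # Hypothetical smooth initial datum (a Gaussian bump), chosen for illustration.
    return math.exp(-x * x)

def rho(t, x, a):
    # Solution by characteristics: rho_t(x) = rho_0(x - a t).
    return rho0(x - a * t)

def pde_residual(t, x, a, h=1e-4):
    # Centred finite differences for d/dt rho + a d/dx rho.
    dt = (rho(t + h, x, a) - rho(t - h, x, a)) / (2 * h)
    dx = (rho(t, x + h, a) - rho(t, x - h, a)) / (2 * h)
    return dt + a * dx

# Residuals at a few sample points; they vanish up to discretisation error.
residuals = [abs(pde_residual(0.7, x, a=2.0)) for x in (-1.0, 0.3, 1.5, 2.4)]
```

The residuals are of the order of the finite-difference error, which is consistent with \(\rho_0(x - at)\) solving the equation.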

Transport equation. Non-divergence form

More generally, if \(x \in \Rd\), the equation \[ \frac{\partial \rho}{\partial t} + v_t(x) \cdot \nabla \rho = 0 \] still admits solutions by characteristics, \(\rho_t \circ X_t = \rho_0\).

If \(v_t\) is Lipschitz in \(x\), the field of characteristics is the unique solution of

\[ \begin{cases} \dfrac{\partial X_t}{\partial t} = v_t(X_t) & t > 0, \\ X_0(y) = y. \end{cases} \]

The map \(X_t: \Rd \to \Rd\) is bijective, since for fixed \(t\) we can solve “backwards” in time

\[ \begin{cases} \dfrac{\partial Y_s}{\partial s} = - v_{t-s}(Y_s) & 0 < s \le t, \\ Y_0(x) = x. \end{cases} \]

Clearly \(X_t(Y_t(x)) = x\). So \(\rho_t = \rho_0 \circ Y_t\).
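The inversion of the flow can be illustrated numerically. This is a sketch under assumptions: the velocity field `v` is a hypothetical choice that is Lipschitz in \(x\), and both ODEs are integrated with the classical fourth-order Runge-Kutta method (a generic ODE solver, not something prescribed by the talk).

```python
import math

def v(t, x):
    # Hypothetical velocity field, Lipschitz in x (an assumption of this sketch).
    return math.sin(x) + 0.5 * t

def rk4(f, x, s0, s1, n=400):
    # Classical Runge-Kutta integration of dx/ds = f(s, x) from s0 to s1.
    h = (s1 - s0) / n
    s = s0
    for _ in range(n):
        k1 = f(s, x)
        k2 = f(s + h / 2, x + h * k1 / 2)
        k3 = f(s + h / 2, x + h * k2 / 2)
        k4 = f(s + h, x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        s += h
    return x

def X(t, y):
    # Forward characteristic: dX/ds = v_s(X), X_0(y) = y.
    return rk4(v, y, 0.0, t)

def Y(t, x):
    # Backward characteristic: dY/ds = -v_{t-s}(Y), Y_0(x) = x, evaluated at s = t.
    return rk4(lambda s, z: -v(t - s, z), x, 0.0, t)

t = 1.5
# X_t(Y_t(x)) should return x up to the integration error.
errors = [abs(X(t, Y(t, x)) - x) for x in (-2.0, 0.0, 0.7, 3.0)]
```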

Transport equation. Divergence form

\[ \frac{\partial \rho}{\partial t} + \diver( \rho v_t(x) ) = 0 \]

We can no longer solve by normal characteristics.

But we can use generalised characteristics (Evans, 1998:chap.3).

We write \[ \frac{\partial \rho}{\partial t} + \nabla \rho \cdot v_t + \rho \diver v_t = 0 \]

In this case it suffices that \(\rho_t \circ X_t = A_t \rho_0\), where the scalar factor \(A_t(y) \in \R\) solves a linear ODE along each characteristic:

\[ \begin{cases} \dfrac{\partial X_t}{\partial t} = v_t(X_t) & t > 0, \\ X_0(y) = y. \end{cases} \qquad \qquad \begin{cases} \dfrac{\partial A_t}{\partial t} = -\diver v_t(X_t) \, A_t & t > 0, \\ A_0(y) = 1. \end{cases} \]

Solving the ODE for \(A_t\), we can write

\[ \rho_t(X_t(y)) = \rho_0(y) e^{-\int_0^t \diver v_s(X_s(y)) \diff s} \]

Conservation

Let us consider \[ \frac{\partial \rho}{\partial t} + \diver( \rho v_t(x) ) = 0 \]

When \(d = 1\), it is easy to see from the explicit solution that \[ \int_{X_t(a)}^{X_t(b)} \rho_t(x) \diff x = \int_a^b \rho_0(y) \diff y. \]

For \(d > 1\), it is easier to check directly that, for any solution and any smooth set \(A \subset \Rd\), \[ \frac{\diff}{\diff t} \int_{X_t(A)} \rho_t(x) \diff x = 0. \]

Replacing \(A\) by \(X_t^{-1}(A)\), this reads \[ \int_{A} \rho_t(x) \diff x = \int_{X_t^{-1}(A)} \rho_0(y) \diff y. \]
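As a sanity check, here is a worked example under the assumption (not in the talk) that \(d = 1\) and \(v_t(x) = x\). Then \(X_t(y) = y e^{t}\) and \(\diver v_t = 1\), so the explicit formula gives \(\rho_t(y e^t) = \rho_0(y) e^{-t}\), that is,

\[ \rho_t(x) = e^{-t} \rho_0(e^{-t} x). \]

The change of variables \(y = e^{-t} x\) recovers the conservation of mass:

\[ \int_{X_t(a)}^{X_t(b)} \rho_t(x) \diff x = \int_{a e^t}^{b e^t} e^{-t} \rho_0(e^{-t} x) \diff x = \int_a^b \rho_0(y) \diff y. \]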

Transport of mass by characteristics

Push-forward

Let \(X, Y\) be measurable spaces,

\(T: X \to Y\) be a measurable map,

\(\mu \in \mathcal M(X)\) be a measure

The push-forward is the measure \(T_\# \mu = \nu \in \mathcal M(Y)\) such that

\[ \nu (B) = \mu (T^{-1} (B)), \qquad \forall B \subset Y \text{ measurable.} \]

Transport equation and push-forward

Let us consider \[ \frac{\partial \rho}{\partial t} + \diver( \rho v_t(x) ) = 0 \]

We had deduced that

\[ \int_{A} \rho_t(x) \diff x = \int_{X_t^{-1}(A)} \rho_0(y) \diff y. \]

With this notation, let \(\mu_t = \rho_t \diff x\), then

\[ \mu_t = (X_t)_\# \mu_0. \]

Push-forward and integration

If \(f : Y \to \mathbb R\) is a simple function taking the values \(z_1, \dots, z_N\), then

\(\displaystyle \int_{Y} f(y) \diff T_\# \mu (y) = \sum_{i=1}^N z_i (T_\#\mu) ( f^{-1} (\{ z_i \} ) )\)

\(\displaystyle \phantom{\int_{Y} f(y) \diff T_\# \mu (y)} = \sum_{i=1}^N z_i \mu ( T^{-1} f^{-1} ( \{ z_i \} ) )\)

\(\displaystyle \phantom{\int_{Y} f(y) \diff T_\# \mu (y)} = \sum_{i=1}^N z_i \mu ( (f \circ T)^{-1} ( \{ z_i \} ) )\)

\(\displaystyle \phantom{\int_{Y} f(y) \diff T_\# \mu (y)} = \int_X f (T(x)) \diff \mu (x).\)

In general

\[ \int_Y f(y) \diff T_\# \mu (y) = \int_X f(T(x)) \diff \mu (x). \]

Push-forward and Dirac deltas

We have shown \[ \int_Y f(y) \diff T_\# \mu (y) = \int_X f(T(x)) \diff \mu (x). \]

It is immediate to deduce that \[ T_\# \delta_0 = \delta_{T(0)} \]

More generally, \[ T_\# \sum_{i=1}^N a_i \delta_{x_i} = \sum_{i=1}^N a_i \delta_{T(x_i)}. \]
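For finite combinations of Dirac deltas, the push-forward and the change of variables formula can be written in a few lines. This is a minimal sketch; the representation of a measure as a list of `(weight, point)` pairs and the particular `T` and `f` are assumptions made for illustration.

```python
def push_forward(T, atoms):
    # atoms: list of (weight a_i, point x_i) representing sum_i a_i delta_{x_i}.
    # The push-forward moves each point and keeps its weight.
    return [(a, T(x)) for a, x in atoms]

def integrate(f, atoms):
    # Integral of f against a finite combination of Dirac deltas.
    return sum(a * f(x) for a, x in atoms)

mu = [(0.2, -1.0), (0.5, 0.5), (0.3, 2.0)]

def T(x):
    # A measurable map, deliberately not injective.
    return x * x

def f(y):
    # A test function.
    return 3.0 * y + 1.0

lhs = integrate(f, push_forward(T, mu))     # integral of f against T_# mu
rhs = integrate(lambda x: f(T(x)), mu)      # integral of f(T(x)) against mu
```

Both `lhs` and `rhs` compute the same weighted sum, which is the discrete version of the identity above.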

Optimal transport and Wasserstein spaces

Optimal transport problem. Monge formulation

Given \(\mu \in \mathcal M(X)\) and \(\nu \in \mathcal M (Y)\),

Set of transports: \[\mathrm{Trans} (\mu, \nu) =\{ T: X \to Y \text{ measurable } \mid \, \nu = T_\# \mu \}\]

Given a “cost of transport” \(c(x,y)\) we look at the optimal transport problem

\[\begin{equation} \label{eq:Monge} \tag{M} \inf_{T \in \mathrm{Trans} (\mu, \nu)} \int_X c(x,T(x)) \, d \mu (x). \end{equation}\]

Clearly \(\mathrm{Trans} (\mu, \nu) = \emptyset\) if \(\mu(X) \ne \nu(Y)\).

We will work with probability measures, i.e. \(\mu (A) \ge 0\) for all measurable \(A\) and \(\mu(X) = 1\); we denote this set by \(\mathcal P(X)\).

The problem of mass splitting:

\[ \mathrm{Trans} \left(\delta_0, \frac 1 2 \delta_0 + \frac 1 2 \delta_1 \right) = \emptyset. \]

Indeed, \(T_\# \delta_0 = \delta_{T(0)}\) is a single Dirac delta, so \(T(0)\) would have to be both \(0\) and \(1\).

Optimal transport problem. Kantorovich formulation

Let \(\mu \in \mathcal P (X)\) and \(\nu \in \mathcal P (Y)\)

Instead of working with maps \(T\), we work with measures on the product \(X \times Y\), called transport plans:

\[ \Pi(\mu, \nu) = \{ \pi \in \mathcal P(X \times Y) \mid \qquad \pi(A\times Y) = \mu(A) \text{ and } \pi(X \times B) = \nu(B) \} \]

This set is never empty, \(\mu \otimes \nu \in \Pi (\mu, \nu)\).

\[ (\mu \otimes \nu) (A \times B) = \mu (A) \nu(B). \]

E.g. \(\mu = \delta_{x_0}\) and \(\nu = \tfrac 1 2 \delta_{x_1} + \tfrac 1 2 \delta_{x_2}\)

The Kantorovich problem is \[\begin{equation} \tag{K} \inf_{ \pi \in \Pi (\mu, \nu) } \int_{X \times Y} c(x,y) \diff \pi(x,y). \end{equation}\]
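For discrete measures a plan is just a matrix with prescribed row and column sums. The sketch below (an illustration with hypothetical helper names) builds the product plan for the mass-splitting example \(\mu = \delta_0\), \(\nu = \tfrac 1 2 \delta_0 + \tfrac 1 2 \delta_1\), checks its marginals, and evaluates the Kantorovich functional for the assumed cost \(c(x,y) = |x-y|^2\).

```python
def product_plan(mu, nu):
    # mu, nu: lists of weights on finite point sets; pi[i][j] = mu_i * nu_j.
    return [[m * n for n in nu] for m in mu]

def marginals(pi):
    # Row sums give the first marginal, column sums give the second.
    rows = [sum(row) for row in pi]
    cols = [sum(col) for col in zip(*pi)]
    return rows, cols

def cost(pi, xs, ys, c):
    # Discrete version of the integral of c(x, y) against pi.
    return sum(pi[i][j] * c(xs[i], ys[j])
               for i in range(len(xs)) for j in range(len(ys)))

xs, mu = [0.0], [1.0]                 # mu = delta_0
ys, nu = [0.0, 1.0], [0.5, 0.5]       # nu = (delta_0 + delta_1) / 2
pi = product_plan(mu, nu)
rows, cols = marginals(pi)
value = cost(pi, xs, ys, lambda x, y: (x - y) ** 2)
```

In this example the product plan is the only element of \(\Pi(\mu, \nu)\), so `value` is also the value of (K), even though the Monge problem has no admissible map.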

Relation between Monge and Kantorovich formulations

If there is a transport map \(T \in \mathrm{Trans} (\mu, \nu)\), then we can take the plan \[\pi_T = (\mathrm{id} \otimes T)_\# \mu.\]

then

\[ \int_{X \times Y} c(x,y) \diff \pi_T (x,y) = \int_{X} c(x,T(x)) \diff \mu (x). \]

Theorem (Pratelli) If \(\mu\) has no atoms and \(c: X\times Y \to [0,\infty)\) is continuous, then \((K) = (M)\). Furthermore, \((K)\) is a \(\min\).

More info: (Villani, 2003), (Villani, 2009), (Ambrosio, Brué & Semola, 2021, Lecture 2)

Theorem (Brenier, Knott-Smith)

Let \(X = Y = \Rd\) and \(c(x,y) = |x-y|^2\), \(\mu, \nu \in \mathcal P (\Rd)\), \(\mu \ll \mathcal L^d\), and assume \[ \int_\Rd |x|^2 \diff \mu(x), \int_\Rd |x|^2 \diff \nu(x) < \infty. \]

Then

\((K)\) is achieved with a unique minimiser \(\pi\).
\((M)\) is achieved with a unique minimiser \(T\).

Furthermore \(T = \nabla \psi\) with \(\psi : \Rd \to (-\infty,\infty]\) convex, l.s.c., \(\mu\)-a.e. differentiable.
The optimal transports in \(\mathrm{Trans} (\mu, \nu)\) and \(\mathrm{Trans} (\nu, \mu)\) are inverses of each other.

We also have a converse:

Let \(\psi : \Rd \to (-\infty,\infty]\) convex, l.s.c., \(\mu\)-a.e. differentiable, and \(|\nabla \psi|^2 \in L^1 (\mu)\).

Then \(T = \nabla \psi\) is an optimal transport between \(\mu\) and \(T_\# \mu\).

More info: (Ambrosio, Brué & Semola, 2021, Lecture 5)

The Wasserstein space

Let \(1 \le p < \infty\). For \(\mu, \nu \in \mathcal P(\Rd)\) take the Wasserstein distance \[ d_p(\mu, \nu) = \left( \inf_{\pi \in \Pi(\mu, \nu)} \int_{\Rd \times \Rd} |x-y|^p \diff \pi (x,y) \right)^{\frac 1 p} \]

We point out that \[ d_p (\mu, \delta_0)^p = \int_{\Rd} |x|^p \diff \mu (x) \]

We take the Wasserstein space \[ \mathcal P_p (\Rd) = \left \{ \mu \in \mathcal P (\Rd) : \int_\Rd |x|^p \diff \mu (x) < \infty \right \} \]

The set of empirical measures is dense in \(( \mathcal P_p (\Rd) , d_p )\):

\[ \left\{ \sum_{i=1}^N a_i \delta_{x_i} : N \in \mathbb N , a_i \ge 0, x_i \in \Rd, \sum_i a_i = 1 \right\} \]
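In dimension one, distances between empirical measures are easy to compute. The sketch below relies on a standard fact that is not stated in the talk: for \(d = 1\), equal weights \(1/N\) and \(p \ge 1\), the monotone (sorted) matching is optimal. The function name `wasserstein_1d` is hypothetical. It also checks the identity \(d_p(\mu, \delta_0)^p = \int |x|^p \diff \mu\) mentioned above.

```python
def wasserstein_1d(xs, ys, p=2):
    # Assumes dimension one and equal weights 1/N on N points each;
    # then matching sorted points is optimal for the cost |x - y|^p, p >= 1.
    assert len(xs) == len(ys)
    n = len(xs)
    total = sum(abs(x - y) ** p for x, y in zip(sorted(xs), sorted(ys)))
    return (total / n) ** (1.0 / p)

xs = [-1.0, 0.5, 2.0, 3.0]
p = 2

# delta_0 written as an empirical measure with N equal atoms at 0.
zeros = [0.0] * len(xs)
moment = sum(abs(x) ** p for x in xs) / len(xs)
dist_to_delta0 = wasserstein_1d(xs, zeros, p)

# Translating every atom by 1 moves the measure at distance exactly 1.
dist_translate = wasserstein_1d(xs, [x + 1.0 for x in xs], p)
```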

Relation to the push-forward

Proposition. Let \(X: \mathbb R^d \to \mathbb R^d\) be Lipschitz, and \(\mu, \nu \in \mathcal P_p (\Rd)\). Then

\[ d_p (X_\# \mu, X_\# \nu) \le \| \nabla X \|_{L^\infty} d_p (\mu, \nu). \]

Proof

Let \(\pi \in \Pi (\mu, \nu)\)

Define \(\widetilde \pi = (X,X)_\# \pi\). Then \(\widetilde \pi \in \Pi ( X_\# \mu, X_\# \nu )\) and

\(\displaystyle \int_{\Rd \times \Rd} |x-y|^p \diff (X,X)_\# \pi (x,y) = \int_{\Rd \times \Rd} |X(x)-X(y)|^p \diff \pi (x,y)\)

\(\displaystyle \phantom{\int_{\Rd \times \Rd} |x-y|^p \diff (X,X)_\# \pi (x,y)}\le\|\nabla X\|_{L^\infty}^p\int_{\Rd \times \Rd} |x-y|^p \diff \pi (x,y)\)

We take \(\inf\) over \(\widetilde \pi\), then \(\inf\) over \(\pi\). \(\square\)
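The constant cannot be improved in general. As an illustration (an example not in the talk), take the dilation \(X(x) = \lambda x\) with \(\lambda \ne 0\), so that \(\|\nabla X\|_{L^\infty} = |\lambda|\). The map \(\pi \mapsto (X,X)_\# \pi\) is then a bijection from \(\Pi(\mu, \nu)\) onto \(\Pi(X_\# \mu, X_\# \nu)\), and the cost scales exactly:

\[ \int_{\Rd \times \Rd} |x-y|^p \diff (X,X)_\# \pi (x,y) = |\lambda|^p \int_{\Rd \times \Rd} |x-y|^p \diff \pi (x,y), \qquad \text{so} \qquad d_p (X_\# \mu, X_\# \nu) = |\lambda| \, d_p (\mu, \nu). \]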

Back to \(\frac{\partial \rho}{\partial t} + \diver( \rho v_t(x) ) = 0\)

The solution is given by \(\rho_t \diff x = (X_t)_\# (\rho_0 \diff x)\) where

\[ \begin{cases} \dfrac{\partial X_t}{\partial t} = v_t(X_t) & t > 0, \\ X(0,y) = y. \end{cases} \]

Differentiating in \(y\),

\[ \begin{cases} \dfrac{\partial}{\partial t} \left( \nabla X_t \right) = ( \nabla v_t (X_t) ) \cdot \nabla X_t & t > 0, \\ \nabla X_0(y) = \mathrm{I}. \end{cases} \]

so, by Grönwall's inequality, we get a nice estimate

\[ \|\nabla X_t\|_{L^\infty} \le \exp\left ( \int_0^t \|\nabla v_s\|_{L^\infty} \diff s \right) . \]

Well-posedness in Wasserstein space

For \[ \frac{\partial \rho}{\partial t} + \diver( \rho v_t(x) ) = 0 \]

we get the continuous dependence estimate

\[ d_p \Big(\rho_t \diff x,\overline \rho_t \diff x\Big) \le \exp\left ( \int_0^t \|\nabla v_s\|_{L^\infty} \diff s \right) d_p \Big(\rho_0 \diff x, \overline \rho_0 \diff x\Big). \]

Similarly, we get continuity in time, \(\mu \in \mathcal C([0,T]; \mathcal P_p (\Rd))\). Using the plan \((X_t, X_s)_\# \mu_0\),

\(\displaystyle d_p \Big((X_t)_\# \mu_0, (X_s)_\# \mu_0\Big)^p \le \int_{\Rd}|X_t(x) - X_s(x)|^p \diff \mu_0(x)\)

\(\displaystyle \phantom{ d_p \Big((X_t)_\# \mu_0, (X_s)_\# \mu_0\Big)^p} \le \int_{\Rd} (1 + |x|)^p \diff \mu_0(x) \left( \sup_{x \in \mathrm{supp} (\mu_0)} \frac{| X_t(x) - X_s(x)|}{1 + |x|}\right)^p\)

Transport equation with measure data

\[ \frac{\partial \mu}{\partial t} + \diver( \mu v_t(x) ) = 0 \]

Continuous dependence estimate for measure data

\[ d_p(\mu_t, \overline \mu_t) \le \exp\left ( \int_0^t \|\nabla v_s\|_{L^\infty} \diff s \right) d_p (\mu_0, \overline \mu_0). \]

But if \[ \mu_0 = \sum_{i=1}^N a_i \delta_{y_i} \]

then

\[ (X_t)_\# \mu_0 = \sum_{i=1}^N a_i \delta_{X_t(y_i)} \]

These are called particle systems. We only need to solve finitely many ODEs!

The particle method for \(\frac{\partial \mu}{\partial t} + \diver( \mu v_t(x) ) = 0\)

Take \(\mu_0 \in \mathcal P_p (\Rd)\)
Approximate by \(\mu_0^N = \sum_{i=1}^N a_i \delta_{y_i}\)
Solve for \(i = 1, \cdots, N\) the ODEs \[ \begin{cases} \dfrac{\partial x_t^{(i)}}{\partial t} = v_t (x_t^{(i)}) \\ x_0^{(i)} = y_i. \end{cases} \]
Write the approximate solution \[ \mu^N_t = \sum_{i=1}^N a_i \delta_{x_t^{(i)}} \]
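The steps above can be sketched in a few lines. This is only an illustration under assumptions: the velocity field \(v_t(x) = -x\) in dimension one is a hypothetical choice (so that the exact characteristics \(x_t = y e^{-t}\) are known), and the ODEs are discretised with explicit Euler, which is one of many possible time integrators.

```python
import math

def v(t, x):
    # Hypothetical velocity field for this sketch: v_t(x) = -x.
    return -x

def particle_method(weights, points, t_end, steps=1000):
    # Explicit Euler for dx/dt = v_t(x), one independent ODE per particle.
    h = t_end / steps
    xs = list(points)
    for n in range(steps):
        t = n * h
        xs = [x + h * v(t, x) for x in xs]
    # The approximate solution is sum_i a_i delta_{x_i(t_end)}.
    return list(zip(weights, xs))

a = [0.25, 0.25, 0.5]      # weights a_i, summing to one
y = [-1.0, 0.5, 2.0]       # initial positions y_i
mu_t = particle_method(a, y, t_end=1.0)

# Exact characteristics for this particular v, used only for comparison.
exact = [yi * math.exp(-1.0) for yi in y]
```

The weights never change; only the positions are transported, which is the discrete counterpart of \(\mu_t = (X_t)_\# \mu_0\).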

Transport equation with measure data. Distributional solutions

We look at \(\mu_t = \sum_{i=1}^N a_i \delta_{x^{(i)}_t}\) where \(\frac{\diff x^{(i)}_t}{\diff t} = v_t(x^{(i)}_t).\)

Integrating against a test function, \(\int_\Rd \varphi_t(x) \diff \mu_t = \sum_{i=1}^N a_i \varphi_t({x^{(i)}_t})\).

\(\displaystyle\frac{\partial }{\partial t} \int_\Rd \varphi_t(x) \diff \mu_t = \sum_{i=1}^N a_i \frac{\partial \varphi_t}{\partial t} ({x^{(i)}_t}) + \sum_{i=1}^N a_i \nabla \varphi_t ({x^{(i)}_t}) \cdot \frac{\diff x^{(i)}_t}{\diff t}\)

\(\displaystyle \phantom{\frac{\partial }{\partial t} \int_\Rd \varphi_t(x) \diff \mu_t}= \sum_{i=1}^N a_i \left( \frac{\partial \varphi_t}{\partial t} ({x^{(i)}_t}) + \nabla \varphi_t ({x^{(i)}_t}) \cdot v_t(x^{(i)}_t) \right)\)

\(\displaystyle \phantom{\frac{\partial }{\partial t} \int_\Rd \varphi_t(x) \diff \mu_t}= \int_\Rd \left( \frac{\partial \varphi_t}{\partial t} (x) + \nabla \varphi_t (x) \cdot v_t(x) \right) \diff \mu_t (x)\)

Integrating in time, for \(\varphi \in C_c^\infty([0,+\infty) \times \Rd)\) we have

\[ \int_0^\infty \int_\Rd \left( - \frac{\partial \varphi_t}{\partial t} - \nabla \varphi_t \cdot v_t(x) \right) \diff \mu_t \diff t = \int_\Rd \varphi_0(x) \diff \mu_0 \]

Interacting particles

The equation \(\frac{\partial \mu}{\partial t} + \diver( \mu v_t(x) ) = 0\) admits solutions with decoupled particles

\[ \mu_t = \sum_{i=1}^N a_i \delta_{x^{(i)}_t} \qquad \text{where } \frac{\diff x^{(i)}_t}{\diff t} = v_t (x^{(i)}_t) \]

People also care about models of interacting particles, e.g.

\[ \mu_t = \sum_{i=1}^N a_i \delta_{x^{(i)}_t} \qquad \text{where } \frac{\diff x^{(i)}_t}{\diff t} = - \sum_{j=1}^N a_j \nabla W (x^{(i)}_t - x^{(j)}_t). \]

First notice that \[ \sum_{j=1}^N a_j \nabla W (x^{(i)}_t - x^{(j)}_t) = \int_\Rd \nabla W(x^{(i)}_t - y) \diff \mu_t(y) = \nabla W * \mu_t (x^{(i)}_t) = - v_t(x^{(i)}_t; \mu) \]

As before, for \(\varphi \in C_c^\infty([0,+\infty) \times \Rd)\) we have

\[ \int_0^\infty \int_\Rd \left( - \frac{\partial \varphi_t}{\partial t} + \nabla \varphi_t \cdot \nabla W * \mu_t \right) \diff \mu_t \diff t = \int_\Rd \varphi_0(x) \diff \mu_0 \]

Then, distributionally \[\begin{equation} \tag{AE} \frac{\partial}{\partial t} \mu_t = \diver (\mu_t \nabla W * \mu_t) \end{equation}\]

This is a non-local PDE. The solution does not have a natural structure \((X_t)_\# \mu_0\).
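The interacting particle system is still a finite system of ODEs, now coupled. The sketch below assumes, purely for illustration, the quadratic potential \(W(z) = z^2/2\) in dimension one and an explicit Euler discretisation. For this choice each particle relaxes towards the centre of mass, which is conserved.

```python
def grad_W(z):
    # Hypothetical interaction potential W(z) = z^2 / 2, so grad W(z) = z.
    return z

def step(weights, xs, h):
    # One explicit Euler step of dx_i/dt = -sum_j a_j grad W(x_i - x_j).
    vel = [-sum(a * grad_W(xi - xj) for a, xj in zip(weights, xs)) for xi in xs]
    return [x + h * u for x, u in zip(xs, vel)]

def mean(weights, xs):
    # Centre of mass: integral of x against mu_t.
    return sum(a * x for a, x in zip(weights, xs))

a = [0.2, 0.3, 0.5]
xs = [-1.0, 0.0, 2.0]
m0 = mean(a, xs)

# Integrate up to time 10 with step 0.005.
for _ in range(2000):
    xs = step(a, xs, h=0.005)

spread = max(xs) - min(xs)
```

Each velocity depends on all particle positions through \(\mu_t\), so the particles can no longer be advanced independently.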

Benamou-Brenier formula

We showed that if you like some PDEs, Wasserstein can help.

The reverse is also true. For two given measures \(\mu_0, \mu_1\) we look at all the possible conservation equations linking them.

Define \(\mathrm{Conv}(\mu_0, \mu_1)\) as the set of pairs \((\mu, v)\) joining \(\mu_0\) to \(\mu_1\) such that

\(\mu : [0,1] \to \mathcal P_2 (\Rd)\) is continuous w.r.t. the weak topology,
\(v : [0,1] \times \Rd \to \Rd\) is Borel,
they are related by \(\partial_t \mu_t + \diver(v_t \mu_t) = 0\).

Then \[ d_2(\mu_0, \mu_1)^2 = \inf_{(\mu,v) \in \mathrm{Conv}(\mu_0, \mu_1)} \int_0^1 \int_\Rd |v_t|^2 (x) \diff \mu_t(x) \diff t \]

See (Ambrosio, Brué & Semola, 2021, Lecture 17)
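A simple worked example (not in the talk) with two Dirac deltas, \(\mu_0 = \delta_{x_0}\) and \(\mu_1 = \delta_{x_1}\): move a single particle along the segment at constant speed,

\[ \mu_t = \delta_{(1-t) x_0 + t x_1}, \qquad v_t \equiv x_1 - x_0 . \]

This pair solves the continuity equation in distributional sense, and its action is

\[ \int_0^1 \int_\Rd |v_t|^2 (x) \diff \mu_t(x) \diff t = |x_1 - x_0|^2 = d_2(\delta_{x_0}, \delta_{x_1})^2 , \]

so it attains the infimum in the formula above.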

Connection between the continuity equation and curves

Let \((M,d)\) be a metric space. The space \(AC^p([0,1]; M)\) is the space of curves \(\gamma:[0,1] \to M\) such that

there exists \(g \in L^p (0,1)\) such that \[ d(\gamma_y,\gamma_x) \le \int_x^y g(t) \diff t, \qquad \forall 0 \le x \le y \le 1. \]

Theorem Let \(\mu_t \in AC^2([0,1]; \mathcal P_2(\Rd))\). Then, there exists a velocity field \(v_t\) such that \(\mu\) is a solution of \[ \frac{\partial \mu_t}{\partial t} + \diver(v_t \mu_t) = 0. \]

Gradient flows in metric spaces

Gradient flows in \(\Rd\)

Let \(F: \Rd \to \mathbb R\). The gradient flow is \[ \frac{\diff u}{\diff t} = - \nabla F (u) \]

If \(D^2 F \ge \lambda I\) then \[|u(t) - \overline u(t) |\le e^{-\lambda t} |u(0) - \overline u(0)|.\]

If \(F\) is strictly convex and attains its minimum, for any \(u(0)\) we have \[u(t) \to u_\infty = \argmin F.\]

Gradient flow of \(F = \frac 1 2 x^2 + \frac 3 2 y^2\)
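For this quadratic example the flow is explicit: \(x(t) = x_0 e^{-t}\), \(y(t) = y_0 e^{-3t}\), and \(D^2 F \ge I\), so \(\lambda = 1\). The following sketch checks the contraction estimate and the convergence to the minimiser for two arbitrarily chosen initial data (the specific numbers are assumptions for illustration).

```python
import math

def flow(u0, t):
    # Exact gradient flow of F(x, y) = x^2 / 2 + 3 y^2 / 2:
    # dx/dt = -x and dy/dt = -3 y, so each coordinate decays exponentially.
    x0, y0 = u0
    return (x0 * math.exp(-t), y0 * math.exp(-3.0 * t))

def dist(u, w):
    # Euclidean distance in the plane.
    return math.hypot(u[0] - w[0], u[1] - w[1])

u0, w0 = (1.0, 2.0), (-0.5, 0.5)
t = 0.8

# Contraction with lambda = 1: |u(t) - w(t)| <= e^{-t} |u(0) - w(0)|.
lhs = dist(flow(u0, t), flow(w0, t))
rhs = math.exp(-1.0 * t) * dist(u0, w0)

# Long-time behaviour: the trajectory approaches argmin F = (0, 0).
far = dist(flow(u0, 20.0), (0.0, 0.0))
```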

Gradient flows in Hilbert spaces

Let \(H\) be a Hilbert space, \(\mathcal F: \mathrm{Dom}(\mathcal F) \subset H \to \mathbb R\) Gateaux differentiable.

The gradient flow is the dynamical system given by

\[ \frac{\diff u}{\diff t} = - \mathcal F'(u) \]

If \(\mathcal F\) is strictly convex, there is at most one minimiser. And there are similar properties to the \(\Rd\) case.

Example. (Heat equation) \(\frac{\partial u}{\partial t} = \Delta u\).

Corresponds to \(H = L^2(\Rd)\) and \(\mathcal F(u) = \frac 1 2 \int_\Rd |\nabla u|^2\), since then \(\mathcal F'(u) = -\Delta u\).

Remark. If \(\mathcal F\) is convex, the implicit Euler is convergent (Crandall & Liggett, 1971)

\[ \frac{U_{n+1}^{(\tau)} - U_n^{(\tau)}}{\tau} = - \mathcal F'(U_{n+1}^{(\tau)}) \]

and each step can be recovered from \(\displaystyle U_{n+1}^{(\tau)} \in \argmin_{x \in H} \left(\frac{\| x - U_n^{(\tau)} \|_H^2 }{2\tau} + \mathcal F(x) \right)\)
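The equivalence between the implicit Euler step and the minimisation problem can be checked in the simplest setting. The sketch assumes \(H = \mathbb R\) and the quadratic energy \(\mathcal F(x) = \tfrac{\lambda}{2} x^2\) with \(\lambda = 3\) (hypothetical choices); the minimiser is located by a ternary search, which is valid here because the objective is strictly convex.

```python
def F(x, lam=3.0):
    # Hypothetical energy on H = R: F(x) = lam * x^2 / 2.
    return 0.5 * lam * x * x

def implicit_euler_step(u, tau, lam=3.0):
    # Solves (x - u) / tau = -F'(x) = -lam * x for x.
    return u / (1.0 + tau * lam)

def objective(x, u, tau):
    # Functional minimised in the variational form of the step.
    return (x - u) ** 2 / (2.0 * tau) + F(x)

def argmin_by_ternary_search(u, tau, lo=-10.0, hi=10.0, iters=200):
    # The objective is strictly convex, so ternary search finds its minimiser.
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1, u, tau) < objective(m2, u, tau):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

u, tau = 2.0, 0.1
x_euler = implicit_euler_step(u, tau)
x_min = argmin_by_ternary_search(u, tau)
```

The two values agree up to the accuracy of the search. The minimisation form only uses the distance and the energy, not the derivative.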

Otto’s calculus (Otto, 1996), (Otto, 2001). Formal tangent bundle

We can think formally about the tangent space \(T_{\rho_0} \mathcal P_2 (\Rd)\).

A generic element of \(T_{\rho_0} \mathcal P_2 (\Rd)\) is the derivative of a curve passing through \(\rho_0\).

A generic curve \(t \in [0,1 ] \to \rho_t\) is a solution of the continuity equation in distributional sense \[ \frac{\partial \rho_t}{\partial t} = - \diver( \rho_t \nabla \psi_t ) \]

Formally the derivative at \(0\) \[ s = \frac{\partial}{\partial t} \Bigg|_{ t=0 } \rho_t = - \diver( \rho_0 \nabla \psi_0 ) \]

We can map \(s \in T_{\rho_0} \mathcal P_2 (\Rd)\) to some gradient field \(\nabla \psi_0\).

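We can even define the metric tensor: if \(s = - \diver( \rho_0 \nabla \psi )\) and \(\overline s = - \diver( \rho_0 \nabla \overline \psi )\), then

\[ \langle s , \overline s \rangle_{\rho_0} = \int_\Rd \langle \nabla \psi, \nabla \overline \psi \rangle \diff \rho_0. \]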

Formal gradient in \(\mathcal P_2\).

Let us do first the example \(\displaystyle \mathcal F[\rho \diff x] = \int_\Rd U(\rho) \diff x.\)

Take \(\xi = \nabla \zeta\) with \(\zeta \in C_c^\infty (\Rd)\). Let \(\rho_\ee := (\mathrm{id}_\Rd + \ee \xi )_\# \rho_0 .\)

The map \((x, \ee) \mapsto \rho_\ee (x)\) is \(C^2\) and \(\displaystyle \qquad \lim_{\ee \to 0} \rho_\ee = \rho_0, \qquad \frac{\partial }{\partial \ee }\Big|_{\ee = 0} \rho_\ee = -\diver( \rho_0 \xi ).\)

\[ \lim_{\ee \to 0} \frac{\mathcal F [\rho_\ee] - \mathcal F[\rho_0]}{\ee} = - \int_\Rd U'(\rho_0) \nabla \cdot (\rho_0 \xi) = \int_\Rd \nabla \zeta \cdot \nabla U'(\rho_0) \diff \rho_0 \]

In distributional sense, we have that

\[ \nabla_{d_2} \left( \int_\Rd U(\rho) \right) = - \diver \left( \rho \nabla U'(\rho) \right) . \]

Formal gradient in \(\mathcal P_2\).

Similarly, using variation formulae (see (Giaquinta & Hildebrandt, 1996)), for general \(\mathcal F\) we get \[ \lim_{\ee \to 0} \frac{\mathcal F [\rho_\ee] - \mathcal F[\rho_0]}{\ee} = \int_\Rd \nabla \zeta \cdot \nabla \frac{\delta \mathcal F}{\delta \rho} [\rho_0] \diff \rho_0 \]

In distributional sense, we have that

\[ \nabla_{d_2} \mathcal F = - \diver \left( \rho \nabla \frac{\delta \mathcal F}{\delta \rho} \right) . \]

More details: (Ambrosio, Gigli & Savare, 2005:sec.10.4.1), (Ambrosio, Brué & Semola, 2021, Lecture 18)

Extension to \(\nabla_{d_p}\) is also available (Otto, 1996), and yields \(p\)-Laplacian flavoured variants.

Formal gradient flows in \(\mathcal P_2\)

Since \(\nabla_{d_2} \mathcal F = - \diver ( \rho \nabla \frac{\delta \mathcal F}{\delta \rho} )\) we have that

\[ \frac{\partial \rho_t}{\partial t} = \diver \left( \rho \nabla \frac{\delta \mathcal F}{\delta \rho} \right) \] is the \(2\)-Wasserstein gradient flow of the energy \(\mathcal F\).

Example. (The Heat Equation) \(\displaystyle \frac{\partial \rho}{\partial t} = \Delta \rho\)

is the formal \(\mathcal P_2\) gradient flow of the Boltzmann entropy

\[\displaystyle \mathcal F[\rho] = \int_\Rd \rho \log \rho.\]
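The computation behind this example is short. With \(U(\rho) = \rho \log \rho\) we have \(\frac{\delta \mathcal F}{\delta \rho} = U'(\rho) = \log \rho + 1\), hence (formally, where \(\rho > 0\))

\[ \rho \nabla U'(\rho) = \rho \frac{\nabla \rho}{\rho} = \nabla \rho, \qquad \diver \left( \rho \nabla \frac{\delta \mathcal F}{\delta \rho} \right) = \diver ( \nabla \rho ) = \Delta \rho . \]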

Example. (The Porous Medium Equation) \(\displaystyle \frac{\partial \rho}{\partial t} = \Delta \rho^m\) corresponds to \(\mathcal F [\rho] = \int_{\Rd} \frac {\rho^m} {m-1}\), since \(\rho \nabla \big( \tfrac{m}{m-1} \rho^{m-1} \big) = \nabla \rho^m\).

Example. (Aggregation Equation) \(\frac{\partial \mu_t}{\partial t} = \diver (\mu_t \nabla W * \mu_t)\) corresponds, for even \(W\), to \(\mathcal F [\mu] = \tfrac 1 2 \int_\Rd (W * \mu) \diff \mu.\)

For existence and uniqueness of the “correct type” of solutions see (Carrillo et al., 2011).

Formal gradient flows in \(\mathcal P_2\)

Example. (Aggregation-Diffusion Equation)

\[ \frac{\partial \rho}{\partial t} = \diver\left( \rho \nabla \left( U'(\rho) + V + W*\rho \right) \right) \]

is formally the \(\mathcal P_2\)-gradient flow of the free energy

\[ \mathcal F[\rho] = \int_\Rd \Big( U(\rho) + V\rho + \frac 1 2 \rho (\rho*W) \Big) \diff x. \]

There is a correct notion of “convexity” (displacement convexity) that reproduces the usual properties.

This problem allows for nice relative entropy arguments.

And that is a beautiful story for another day.

Talk on November 25, 2022, at the UAM seminar.

Rigorous gradient flows in metric spaces: curves of maximal slope

When \(X\) is a Banach space, \(\displaystyle \frac{\partial \rho}{\partial t} = - \nabla_{X} \mathcal F [\rho(t)]\) in \(X^*\).

The main idea is the equivalence for \(u : [0,T] \to \Rd\) that \[ u'(t) = - \nabla \mathcal F (u), \qquad \iff \qquad \begin{cases} \dfrac{\diff }{\diff t} (\mathcal F \circ u ) = -| \nabla \mathcal F (u) | |u'| & \text{orientation} \\ |u'| = |\nabla \mathcal F (u)| & \text{norm} \end{cases} \]

We define the metric derivative and the metric slope \[ | \mu' | (t)= \limsup_{h \to 0} \frac{ d(\mu(t+h) , \mu(t)) }{|h|}, \qquad | \partial \mathcal F | [\mu] = \limsup_{\nu \to \mu} \frac{ (\mathcal F [\mu] - \mathcal F[\nu])_+ }{d (\mu, \nu)}\]

Definition (curve of maximal slope)

A locally absolutely continuous curve \(t \mapsto \mu (t) \in M\) such that \(t \mapsto \mathcal F[\mu(t)]\) is absolutely continuous and \[ \frac 1 2 \int_s^t |\mu'|^2(r) \diff r + \frac 1 2 \int_s^t |\partial \mathcal F|^2 [\mu(r)] \diff r \le \mathcal F [\mu(s)] - \mathcal F [\mu(t)] \qquad \forall 0 \le s < t \le T \]

“Implicit Euler” in metric spaces. The JKO scheme

Let \((M,d)\) be a metric space and \(\mathcal F: M \to \mathbb R\).

The JKO scheme (Jordan, Kinderlehrer & Otto, 1998) extends the implicit Euler scheme in the minimisation form

\[\begin{equation} \tag{JKO} U_{n+1}^{(\tau)} \in \argmin_{x \in M} \left(\frac{d(x,U_n^{(\tau)})^2 }{2\tau} + \mathcal F(x) \right). \end{equation}\]

To be more rigorous, define \[ u_t^{(\tau)} = U_{n}^{(\tau)} \quad \text{for } n \tau \le t < (n+1) \tau. \]

(Ambrosio, Gigli & Savare, 2005) explains how JKO converges to curves of maximal slope in \(\mathcal P_2\).

Numerics (Carrillo et al., 2019)

And this conversation is left for another day.

A mention on the Fields Medals

Cédric Villani won the Fields Medal in 2010 (see (Yau, 2011))

The Boltzmann and Landau equations: relative entropy arguments

(Desvillettes & Villani, 2005), (Mouhot & Villani, 2011)

Ricci flow as a gradient flow:

(Otto & Villani, 2000) and (Lott & Villani, 2009).

Alessio Figalli won the Fields Medal in 2018.

Thesis 2007: Optimal transportation and action-minimizing measures

He used this to study the Monge-Ampère equation.
Hardy-Littlewood-Sobolev inequalities
obstacle problem (with L. Caffarelli)
fractional Laplacian

Beyond characteristics and gradient flows: the duality approach

Multi-species problems

Consider multiple interacting species indexed by \(k = 1, \cdots ,K\)

and each species with several particles indexed by \(i\).

Each pair of species \(k, \ell\) interacts through a potential \(W_{k,\ell}\).

Then \[ \frac{\diff}{\diff t} x^{(k,i)}_t = - \sum_\ell \sum_{j} a_{j,k} \nabla W_{k,\ell} (x^{(k,i)}_t - x^{(\ell,j)}_t). \]

This gives a system of PDEs for the empirical measures \[ \left\{ \sum_{i=1}^N a_i \delta_{x_i} : N \in \mathbb N , a_i \ge 0, x_i \in \Rd, \sum_i a_i = 1 \right\} \]

These are not gradient flows, except in particular cases (Di Francesco, Esposito & Fagioli, 2018)

Vanishing viscosity approximation

If we introduce Brownian noise to the particles, then it is natural to study \[ \frac{\partial}{\partial t} \mu^{(k)}_t = \ee \Delta \mu^{(k)} + \diver\left(\mu^{(k)} \nabla \sum_{\ell} W_{k,\ell} * \mu^{(\ell)} \right). \]

This is also the vanishing viscosity approximation.

It does not admit characteristics.

In (Carrillo & G-C, 2022) we study well-posedness by duality

(no characteristics and no gradient flow structure)

Duality approach to \(\mathcal P_1\)

The duality characterisation (Villani, 2003, Theorem 1.14)

\[ d_1 (\mu, \nu) = \sup \left\{ \int_\Rd \psi \diff (\mu - \nu) : \Lip(\psi) \le 1 \right\}. \]
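For instance (a quick check, not part of the talk), take \(\mu = \delta_a\) and \(\nu = \delta_b\) with \(a, b \in \Rd\). For any \(\psi\) with \(\Lip(\psi) \le 1\),

\[ \int_\Rd \psi \diff (\delta_a - \delta_b) = \psi(a) - \psi(b) \le |a - b|, \]

with equality for \(\psi(x) = |x - b|\). Hence \(d_1(\delta_a, \delta_b) = |a-b|\), in agreement with the Kantorovich formulation, where the only plan is \(\delta_a \otimes \delta_b\).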

Duality approach to PDEs

We will look at the problems, for \(\ee \ge 0\) \[ \partial_t \mu_t + \diver (\mu_t v_t(x)) = \ee \Delta \mu_t \]

The distributional formulation is \[ \int_\Rd \varphi_T \diff \mu_T - \int_0^T \int_\Rd \left( \frac{\partial \varphi_t}{\partial t} + v_t(x) \cdot \nabla \varphi_t + \ee \Delta \varphi_t \right) \diff \mu_t \diff t = \int_\Rd \varphi_0 \diff \mu_0. \]

The central term can be cancelled if we take \(\varphi_t = \psi_{T-t}\) given by the adjoint problem \[ \frac{\partial \psi_s}{\partial s} = \nabla \psi_s \cdot v_{T-s}(x) + \ee \Delta \psi_s \]

We get \[ \int_\Rd \psi_0 \diff \mu_T = \int_\Rd \psi_T \diff \mu_0. \]
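As a consistency check with the first part of the talk, consider the case \(\ee = 0\) with \(v_t\) Lipschitz (a sketch under these assumptions). The equation for \(\varphi_t = \psi_{T-t}\) is \(\partial_t \varphi_t + v_t \cdot \nabla \varphi_t = 0\), so \(\varphi_t\) is constant along the characteristics \(X_t\):

\[ \frac{\diff}{\diff t} \Big( \varphi_t (X_t(y)) \Big) = \frac{\partial \varphi_t}{\partial t} (X_t) + \nabla \varphi_t (X_t) \cdot v_t(X_t) = 0, \qquad \text{hence} \qquad \psi_T (y) = \varphi_0(y) = \varphi_T(X_T(y)) = \psi_0 (X_T(y)). \]

The duality identity then reads

\[ \int_\Rd \psi_0 \diff \mu_T = \int_\Rd \psi_0 (X_T(y)) \diff \mu_0 (y), \]

which is exactly \(\mu_T = (X_T)_\# \mu_0\).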

Duality approach to PDEs. (Carrillo & G-C, 2022)

We will look at the problems, for \(\ee \ge 0\) \[ \partial_t \mu_t + \diver (\mu_t v_t(x)) = \ee \Delta \mu_t, \qquad \qquad \partial_t \overline \mu_t + \diver (\overline \mu_t \overline v_t(x)) = \ee \Delta \overline \mu_t. \]

We take the test functions \[ \partial_s \psi_s = \nabla \psi_s \cdot v_{T-s}(x) + \ee \Delta \psi_s \qquad \qquad \partial_s \overline \psi_s = \nabla \overline \psi_s \cdot \overline v_{T-s}(x) + \ee \Delta \overline \psi_s \] with \(\psi_0 = \overline \psi_0\)

Then \(\displaystyle \int_\Rd \psi_0 \diff (\mu_T - \overline \mu_T) = \int_\Rd \psi_T \diff \mu_0 - \int_\Rd \overline \psi_T \diff \overline \mu_0\)

Or even better \(\displaystyle \int_\Rd \psi_0 \diff (\mu_T - \overline \mu_T) = \int_\Rd \psi_T \diff (\mu_0 - \overline \mu_0) + \int_\Rd (\psi_T - \overline \psi_T) \diff \overline \mu_0\)

Thus

\[ \begin{aligned} d_1(\mu_T, \overline \mu_T) & \le \underbrace{ d_1 (\mu_0, \overline \mu_0) \sup_{\Lip(\psi_0) \le 1} \Lip(\psi_T) }_{\textrm{Cont. dependence on }\mu_0} + \underbrace{ \left( 1 + \int_\Rd |x| \diff \overline \mu_0 \right) \sup_{\Lip(\psi_0) \le 1} \left \| \frac{\psi_T - \overline \psi_T}{1+|x|} \right \|_{L^\infty} }_{\textrm{Cont. dependence on } v} . \end{aligned} \]

Dual-viscosity formulation (Carrillo & G-C, 2022)

Let us go back to the system \[ \frac{\partial}{\partial t} \mu^{(k)}_t = \ee \Delta \mu^{(k)} + \diver\left(\mu^{(k)} \nabla \sum_{\ell} W_{k,\ell} * \mu^{(\ell)} \right) \]

for simplicity in this presentation \(W_{k,\ell}\) with \(\nabla W_{k,\ell} \in \Lip\).

Definition. (dual-viscosity solution) \(\mu^{(k)} \in C( [0,\infty), \mathcal P_1(\Rd) )\) such that

for each \(T\ge 0\), each \(k\), and each Lipschitz \(\psi_0\), if \(\psi^{(k),T}\) is the unique viscosity solution with initial datum \(\psi_0\) of \[ \frac{\partial}{\partial t} \psi = \ee \Delta \psi + \nabla \psi \cdot \nabla \left( \sum_{\ell} W_{k,\ell} * \mu^{(\ell)} \right) \]

then \(\displaystyle \int_\Rd \psi_0 \diff \mu^{(k)}_T = \int_\Rd \psi^{(k),T}_T \diff \mu^{(k)}_0.\)

Dual-viscosity formulation (Carrillo & G-C, 2022)

Theorem. For all \(\varepsilon \ge 0\), the problem is well-posed in the class of dual-viscosity solutions.

Our framework covers more general settings: \(\nabla \sum_{\ell} W_{k,\ell} * \mu^{(\ell)}\) replaced by general \(v_t[\mu]\).

Proof: A priori estimates for \(\psi\) and fixed point argument. \(\square\)

Another beautiful story, left for another day.

Conclusion

The conservation laws \[\begin{equation} \tag{C} \frac{\partial \rho_t}{\partial t} + \diver (\rho_t v_t) = 0 \end{equation}\] are solved as the push-forward through characteristics

Wasserstein spaces are natural to study (C)

In fact, (C) characterises \(\mathcal P_2\) by Benamou-Brenier

Many other conservation laws are gradient flows in Wasserstein space

(Carrillo & G-C, 2022) introduces a new approach beyond push-forward and gradient flows

Thank you!

Bibliography

References

Ambrosio, L., Brué, E. & Semola, D. (2021) Lectures on optimal transport. Springer International Publishing. doi:10.1007/978-3-030-72162-6.

Ambrosio, L., Gigli, N. & Savare, G. (2005) Gradient Flows. Lectures in Mathematics ETH Zürich. Basel, Birkhäuser-Verlag. doi:10.1007/b137080.

Carrillo, J.A., Craig, K., Wang, L. & Wei, C. (2019) Primal dual methods for Wasserstein gradient flows. arXiv. https://arxiv.org/abs/1901.08081.

Carrillo, J.A., DiFrancesco, M., Figalli, A., Laurent, T. & Slepčev, D. (2011) Global-in-time weak measure solutions and finite-time aggregation for nonlocal interaction equations. Duke Math. J. 156 (2), 229–271. doi:10.1215/00127094-2010-211.

Carrillo, J.A. & G-C (2022) Interpreting systems of continuity equations in spaces of probability measures through PDE duality. June 2022. https://arxiv.org/abs/2206.03968.

Crandall, M.G. & Liggett, T.M. (1971) Generation of Semi-Groups of Nonlinear Transformations on General Banach Spaces. Am. J. Math. 93 (2), 265. doi:10.2307/2373376.

Desvillettes, L. & Villani, C. (2005) On the trend to global equilibrium for spatially inhomogeneous kinetic systems: The Boltzmann equation. Inventiones mathematicae. 159 (2), 245–316. doi:10.1007/s00222-004-0389-9.

Di Francesco, M., Esposito, A. & Fagioli, S. (2018) Nonlinear degenerate cross-diffusion systems with nonlocal interaction. Nonlinear Anal. Theory, Methods Appl. 169, 94–117. doi:10.1016/j.na.2017.12.003.

Evans, L.C. (1998) Partial Differential Equations. Providence, Rhode Island, American Mathematical Society.

Giaquinta, M. & Hildebrandt, S. (1996) The Lagrangian formalism. Calculus of variations. I. Grundlehren der mathematischen Wissenschaften. Springer-Verlag, Berlin.

Jordan, R., Kinderlehrer, D. & Otto, F. (1998) The variational formulation of the Fokker-Planck equation. SIAM J. Math. Anal. 29 (1), 1–17. doi:10.1137/S0036141096303359.

Lott, J. & Villani, C. (2009) Ricci curvature for metric-measure spaces via optimal transport. Annals of Mathematics. 169 (3), 903–991. http://www.jstor.org/stable/25662148.

Mouhot, C. & Villani, C. (2011) On Landau damping. Acta Mathematica. 207 (1), 29–201. doi:10.1007/s11511-011-0068-9.

Otto, F. (1996) Double degenerate diffusion equations as steepest descent. 1–43. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.36.5263&rep=rep1&type=pdf.

Otto, F. (2001) The geometry of dissipative evolution equations: The porous medium equation. Commun. Partial Differ. Equations. 26 (1-2), 101–174. doi:10.1081/PDE-100002243.

Otto, F. & Villani, C. (2000) Generalization of an inequality by Talagrand and links with the logarithmic Sobolev inequality. Journal of Functional Analysis. 173 (2), 361–400. doi:10.1006/jfan.1999.3557.

Villani, C. (2009) Optimal Transport. Grundlehren der mathematischen Wissenschaften. Berlin, Heidelberg, Springer Berlin Heidelberg. doi:10.1007/978-3-540-71050-9.

Villani, C. (2003) Topics in optimal transportation. American Mathematical Society. doi:10.1090/gsm/058.

Yau, H.-T. (2011) The work of Cédric Villani. In: Proceedings of the international congress of mathematicians 2010 (ICM 2010). June 2011 Published by Hindustan Book Agency, India. doi:10.1142/9789814324359_0004.
