PDEs and Wasserstein Spaces

Seminario de Análisis Matemático y Matemática Aplicada UCM
Nov. 3, 2022

Author: David Gómez-Castro

Affiliation: Universidad Complutense de Madrid

gomezcastro.xyz

Some simple PDEs

Transport equations. Constant velocity

One of the easiest PDEs to solve is the transport equation with constant velocity: for \(t,x \in \mathbb R\), \[ \frac{\partial \rho_t}{\partial t} (x) + a \frac{\partial \rho_t}{\partial x} (x) = 0 \]

It can be solved by characteristics \(\rho_t (X_t(y)) = \rho_0(y)\). Plugging this in

\[ 0 = \frac{\partial }{\partial t} \Big( \rho_t \circ X_t \Big) = \frac{\partial \rho_t}{\partial t} (X_t) + \frac{\partial \rho_t}{\partial x} (X_t) \frac{\partial X_t}{\partial t} \]

It suffices that \(\frac{\partial X_t}{\partial t} = a\), together with \(X_0(y) = y\) to meet the initial datum.

Hence \(X_t(y) = y + at\), and therefore

\[ \rho_t(x) = \rho_0 (x - at) \]
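As a quick sanity check, the explicit formula can be tested numerically. The sketch below assumes an illustrative speed `a = 2` and a Gaussian bump as initial datum (neither comes from the talk) and evaluates the PDE residual by centred finite differences.

```python
import math

a = 2.0  # transport speed (illustrative value, an assumption)

def rho0(x):
    """Initial datum: a smooth bump, chosen only for illustration."""
    return math.exp(-x * x)

def rho(t, x):
    """Solution by characteristics: rho_t(x) = rho_0(x - a t)."""
    return rho0(x - a * t)

def pde_residual(t, x, h=1e-5):
    """Centred finite-difference approximation of d_t rho + a d_x rho."""
    dt = (rho(t + h, x) - rho(t - h, x)) / (2 * h)
    dx = (rho(t, x + h) - rho(t, x - h)) / (2 * h)
    return dt + a * dx

# The residual should vanish up to discretisation and rounding error.
max_residual = max(
    abs(pde_residual(t, x)) for t in (0.0, 0.5, 1.0) for x in (-1.0, 0.3, 2.0)
)
print(max_residual)
```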

Transport equation. Non-divergence form

More generally, if \(x \in \Rd\), the equation \[ \frac{\partial \rho}{\partial t} + v_t(x) \cdot \nabla \rho = 0 \] still admits solutions by characteristics \(\rho_t \circ X_t = \rho_0\).

If \(v_t\) is Lipschitz in \(x\), the field of characteristics is the unique solution of

\[ \begin{cases} \dfrac{\partial X_t}{\partial t} = v_t(X_t) & t > 0, \\ X_0(y) = y. \end{cases} \]

The map \(X_t: \Rd \to \Rd\) is bijective, since we can solve “backwards” in time

\[ \begin{cases} \dfrac{\partial Y_s}{\partial s} = - v_{t-s}(Y_s) & 0 < s \le t, \\ Y_0(x) = x. \end{cases} \]

Clearly \(X_t(Y_t(x)) = x\). So \(\rho_t = \rho_0 \circ Y_t\).
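The forward and backward characteristics can be illustrated numerically. This is a minimal sketch in dimension one, assuming a made-up velocity field `v(t, x) = sin(x) + t` and a hand-written RK4 integrator (both are assumptions, not part of the talk); it checks that \(X_t(Y_t(x)) \approx x\).

```python
import math

def v(t, x):
    """Illustrative velocity field, Lipschitz in x (an assumption)."""
    return math.sin(x) + t

def rk4(f, s0, s1, z, steps=400):
    """Integrate dz/ds = f(s, z) from s0 to s1 with the classical RK4 scheme."""
    h = (s1 - s0) / steps
    s = s0
    for _ in range(steps):
        k1 = f(s, z)
        k2 = f(s + h / 2, z + h * k1 / 2)
        k3 = f(s + h / 2, z + h * k2 / 2)
        k4 = f(s + h, z + h * k3)
        z += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        s += h
    return z

def X(t, y):
    """Forward characteristic: dX/dt = v_t(X), X_0(y) = y."""
    return rk4(v, 0.0, t, y)

def Y(t, x):
    """Backward characteristic: dY/ds = -v_{t-s}(Y), Y_0(x) = x, evaluated at s = t."""
    return rk4(lambda s, z: -v(t - s, z), 0.0, t, x)

t = 1.5
points = [-2.0, 0.0, 0.7, 3.0]
# X_t(Y_t(x)) should return x up to the integration error.
round_trip_error = max(abs(X(t, Y(t, x)) - x) for x in points)
print(round_trip_error)
```

With these two maps, \(\rho_t(x)\) is evaluated as `rho0(Y(t, x))` for any initial datum `rho0`.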

Transport equation. Divergence form

\[ \frac{\partial \rho}{\partial t} + \diver( \rho v_t(x) ) = 0 \]

We can no longer solve by normal characteristics.

But we can use generalised characteristics (Evans, 1998:chap.3).

We write \[ \frac{\partial \rho}{\partial t} + \nabla \rho \cdot v_t + \rho \diver v_t = 0 \]

In this case it suffices that \(\rho_t \circ X_t = A_t \rho_0\) with \(A_t(y) \in \R\), where

\[ \begin{cases} \dfrac{\partial X_t}{\partial t} = v_t(X_t) & t > 0, \\ X_0(y) = y. \end{cases} \qquad \qquad \begin{cases} \dfrac{\partial A_t}{\partial t} = - A_t \diver v_t(X_t) & t > 0, \\ A_0(y) = 1. \end{cases} \]

Finally, we can write

\[ \rho_t(X_t(y)) = \rho_0(y) e^{-\int_0^t \diver v_s(X_s(y)) \diff s} \]

Conservation

Let us consider \[ \frac{\partial \rho}{\partial t} + \diver( \rho v_t(x) ) = 0 \]

When \(d = 1\), it is easy to see from the explicit solution that \[ \int_{X_t(a)}^{X_t(b)} \rho_t(x) \diff x = \int_a^b \rho_0(y) \diff y. \]

For \(d > 1\), it is easier to check directly that, for any solution and any smooth set \(A \subset \Rd\), \[ \frac{\diff}{\diff t} \int_{X_t(A)} \rho_t(x) \diff x = 0 \]

so

\[ \int_{A} \rho_t(x) \diff x = \int_{X_t^{-1}(A)} \rho_0(y) \diff y. \]
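The one-dimensional conservation identity can be checked on an explicit example. The sketch assumes \(v(x) = x\) (so \(X_t(y) = y e^t\) and \(\diver v = 1\)) and a Gaussian initial datum; these choices and the midpoint quadrature are illustrative assumptions.

```python
import math

def rho0(y):
    """Initial density (illustrative choice)."""
    return math.exp(-y * y)

def X(t, y):
    """Characteristics of the assumed field v(x) = x: X_t(y) = y e^t."""
    return y * math.exp(t)

def rho(t, x):
    """rho_t(X_t(y)) = rho_0(y) exp(-int_0^t div v ds), with div v = 1."""
    return rho0(x * math.exp(-t)) * math.exp(-t)

def integrate(f, lo, hi, n=20000):
    """Composite midpoint rule on [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

a, b, t = -0.5, 1.2, 0.8
mass_initial = integrate(rho0, a, b)
# Mass of rho_t over the transported interval [X_t(a), X_t(b)].
mass_transported = integrate(lambda x: rho(t, x), X(t, a), X(t, b))
print(mass_initial, mass_transported)
```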

Transport of mass by characteristics

Push-forward

Let \(X, Y\) be measure spaces,

\(T: X \to Y\) be a measurable map,

\(\mu \in \mathcal M(X)\) be a measure

The push-forward is the measure \(T_\# \mu = \nu \in \mathcal M(Y)\) such that

\[ \nu (B) = \mu (T^{-1} (B)), \qquad \forall B \subset Y \text{ measurable.} \]

\(\mu(A) = \int_A f(x) dx\)

\(\nu(B) = \int_B g(y) dy\)

Transport equation and push-forward

Let us consider \[ \frac{\partial \rho}{\partial t} + \diver( \rho v_t(x) ) = 0 \]

We had deduced that

\[ \int_{A} \rho_t(x) \diff x = \int_{X_t^{-1}(A)} \rho_0(y) \diff y. \]

With this notation, let \(\mu_t = \rho_t \diff x\), then

\[ \mu_t = (X_t)_\# \mu_0. \]

Push-forward and integration

If \(f : Y \to \mathbb R\) is a simple function taking the values \(z_1, \ldots, z_N\), then

\(\displaystyle \int_{Y} f(y) \diff T_\# \mu (y) = \sum_{i=1}^N z_i (T_\#\mu) ( f^{-1} (\{ z_i \} ) )\)

\(\displaystyle \phantom{\int_{Y} f(y) \diff T_\# \mu (y)} = \sum_{i=1}^N z_i \mu ( T^{-1} f^{-1} ( \{ z_i \} ) )\)

\(\displaystyle \phantom{\int_{Y} f(y) \diff T_\# \mu (y)} = \sum_{i=1}^N z_i \mu ( (f \circ T)^{-1} ( \{ z_i \} ) )\)

\(\displaystyle \phantom{\int_{Y} f(y) \diff T_\# \mu (y)} = \int_X f (T(x)) \diff \mu (x).\)

In general

\[ \int_Y f(y) \diff T_\# \mu (y) = \int_X f(T(x)) \diff \mu (x). \]

Push-forward and Dirac deltas

We have shown \[ \int_Y f(y) \diff T_\# \mu (y) = \int_X f(T(x)) \diff \mu (x). \]

It is immediate to deduce that \[ T_\# \delta_0 = \delta_{T(0)} \]

and, more generally, that \[ T_\# \sum_{i=1}^N a_i \delta_{x_i} = \sum_{i=1}^N a_i \delta_{T(x_i)} \]
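For discrete measures both formulas are easy to test. In the sketch below a measure is stored as a list of `(weight, point)` pairs; this representation, the map `T` and the function `f` are illustrative assumptions.

```python
def push_forward(T, mu):
    """Push a discrete measure mu = [(a_i, x_i)] forward through the map T."""
    return [(a, T(x)) for a, x in mu]

def integral(f, mu):
    """Integrate f against the discrete measure mu."""
    return sum(a * f(x) for a, x in mu)

mu = [(0.2, -1.0), (0.5, 0.0), (0.3, 2.0)]

def T(x):
    """Illustrative measurable map."""
    return x * x + 1.0

def f(y):
    """Illustrative test function."""
    return 3.0 * y - 1.0

lhs = integral(f, push_forward(T, mu))   # int f d(T# mu)
rhs = integral(lambda x: f(T(x)), mu)    # int f(T(x)) d mu
dirac_image = push_forward(T, [(1.0, 0.0)])  # T# delta_0 = delta_{T(0)}
print(lhs, rhs, dirac_image)
```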

Optimal transport and Wasserstein spaces

Optimal transport problem. Monge formulation

Given \(\mu \in \mathcal M(X)\) and \(\nu \in \mathcal M (Y)\),

Set of transports: \[\mathrm{Trans} (\mu, \nu) =\{ T: X \to Y \text{ measurable } \mid \, \nu = T_\# \mu \}\]

Given a “cost of transport” \(c(x,y)\) we look at the optimal transport problem

\[\begin{equation} \label{eq:Monge} \tag{M} \inf_{T \in \mathrm{Trans} (\mu, \nu)} \int_X c(x,T(x)) \, d \mu (x). \end{equation}\]

Clearly \(\mathrm{Trans} (\mu, \nu) = \emptyset\) if \(\mu(X) \ne \nu(Y)\).

We will work with probability measures: \(\mu(X) = 1\) and \(\mu (A) \ge 0\).

The problem of mass splitting:

\[ \mathrm{Trans} \left(\delta_0, \frac 1 2 \delta_0 + \frac 1 2 \delta_1 \right) = \emptyset. \]

Indeed, what would \(T(0)\) be?

Optimal transport problem. Kantorovich formulation

Let \(\mu \in \mathcal P (X)\) and \(\nu \in \mathcal P (Y)\)

Instead of working with maps \(T\), we work with measures on the product \(X \times Y\)

\[ \Pi(\mu, \nu) = \{ \pi \in \mathcal P(X \times Y) \mid \qquad \pi(A\times Y) = \mu(A) \text{ and } \pi(X \times B) = \nu(B) \} \]

This set is never empty, \(\mu \otimes \nu \in \Pi (\mu, \nu)\).

\[ (\mu \otimes \nu) (A \times B) = \mu (A) \nu(B). \]

E.g. \(\mu = \delta_{x_0}\) and \(\nu = \tfrac 1 2 \delta_{x_1} + \tfrac 1 2 \delta_{x_2}\)

The Kantorovich problem is \[\begin{equation} \tag{K} \inf_{ \pi \in \Pi (\mu, \nu) } \int_{X \times Y} c(x,y) \diff \pi(x,y). \end{equation}\]
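The mass-splitting example can be worked out explicitly. The sketch stores discrete measures as dictionaries `point -> mass` (an assumed representation) and takes \(\mu = \delta_0\), \(\nu = \tfrac12 \delta_0 + \tfrac12 \delta_1\) with the cost \(|x-y|^2\). Since \(\mu\) is a Dirac delta, the product plan is the only element of \(\Pi(\mu,\nu)\), so its cost is the value of (K).

```python
def product_coupling(mu, nu):
    """Product plan mu (x) nu, stored as a dict (x, y) -> mass."""
    return {(x, y): a * b for x, a in mu.items() for y, b in nu.items()}

def marginals(pi):
    """First and second marginals of a discrete plan."""
    first, second = {}, {}
    for (x, y), m in pi.items():
        first[x] = first.get(x, 0.0) + m
        second[y] = second.get(y, 0.0) + m
    return first, second

def transport_cost(pi, c):
    """Integral of the cost c(x, y) against the plan pi."""
    return sum(m * c(x, y) for (x, y), m in pi.items())

mu = {0.0: 1.0}
nu = {0.0: 0.5, 1.0: 0.5}
pi = product_coupling(mu, nu)
cost = transport_cost(pi, lambda x, y: (x - y) ** 2)
print(marginals(pi), cost)
```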

Relation between Monge and Kantorovich formulations

If there is a transport map \(T \in \mathrm{Trans}(\mu, \nu)\), then we can take the plan \[\pi_T = (\mathrm{id}, T)_\# \mu \in \Pi(\mu, \nu),\]

and then

\[ \int_{X \times Y} c(x,y) \diff \pi_T (x,y) = \int_{X} c(x,T(x)) \diff \mu (x). \]

Theorem (Pratelli) If \(\mu\) has no atoms and \(c: X\times Y \to [0,\infty)\) is continuous, then \((K) = (M)\). Furthermore, \((K)\) is a \(\min\).

More info: (Villani, 2003), (Villani, 2009), (Ambrosio, Brué & Semola, 2021, Lecture 2)

Theorem (Brenier, Knott-Smith)

Let \(X = Y = \Rd\), \(c(x,y) = |x-y|^2\), \(\mu, \nu \in \mathcal P (\Rd)\), \(\mu \ll \mathcal L^d\), and assume \[ \int_\Rd |x|^2 \diff \mu(x), \int_\Rd |x|^2 \diff \nu(x) < \infty. \]

Then

  • \((K)\) is achieved with a unique minimiser \(\pi\).

  • \((M)\) is achieved with a unique minimiser \(T\).

    Furthermore \(T = \nabla \psi\) with \(\psi : \Rd \to (-\infty,\infty]\) convex, l.s.c., \(\mu\)-a.e. differentiable.

  • The optimal transports in \(\mathrm{Trans} (\mu, \nu)\) and \(\mathrm{Trans} (\nu, \mu)\) are inverses of each other.

We also have that

  • Let \(\psi : \Rd \to (-\infty,\infty]\) convex, l.s.c., \(\mu\)-a.e. differentiable, and \(|\nabla \psi|^2 \in L^1 (\mu)\).

    Then \(T = \nabla \psi\) is an optimal transport between \(\mu\) and \(T_\# \mu\).

More info: (Ambrosio, Brué & Semola, 2021, Lecture 5)

The Wasserstein space

Let \(1 \le p < \infty\). For \(\mu, \nu \in \mathcal P(\Rd)\) take the Wasserstein distance \[ d_p(\mu, \nu) = \left( \inf_{\pi \in \Pi(\mu, \nu)} \int_{\Rd \times \Rd} |x-y|^p \diff \pi (x,y) \right)^{\frac 1 p} \]

We point out that \[ d_p (\mu, \delta_0)^p = \int_{\Rd} |x|^p \diff \mu (x) \]

We take the Wasserstein space \[ \mathcal P_p (\Rd) = \left \{ \mu \in \mathcal P (\Rd) : \int_\Rd |x|^p \diff \mu (x) < \infty \right \} \]

The set of empirical measures is dense in \(( \mathcal P_p (\Rd) , d_p )\):

\[ \left\{ \sum_{i=1}^N a_i \delta_{x_i} : N \in \mathbb N , a_i \ge 0, x_i \in \Rd, \sum_i a_i = 1 \right\} \]
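For empirical measures on the real line with equal weights, \(d_p\) can be computed by sorting. The sketch relies on the standard one-dimensional fact, not stated on the slides, that the monotone matching of sorted samples is optimal for the cost \(|x-y|^p\) with \(p \ge 1\). It also checks the identity \(d_p(\mu, \delta_0)^p = \int |x|^p \diff \mu\). The function name and the sample points are illustrative.

```python
def wasserstein_1d(xs, ys, p=2):
    """d_p between (1/N) sum delta_{x_i} and (1/N) sum delta_{y_i} on the real line.

    Assumes equal weights and uses the monotone (sorted) matching,
    which is optimal in dimension one for the cost |x - y|^p, p >= 1.
    """
    if len(xs) != len(ys):
        raise ValueError("both samples need the same number of points")
    n = len(xs)
    total = sum(abs(x - y) ** p for x, y in zip(sorted(xs), sorted(ys)))
    return (total / n) ** (1.0 / p)

xs = [0.5, -1.0, 2.0, 3.5]
p = 2
# Distance to delta_0, written as N copies of the point 0.
dist_to_dirac = wasserstein_1d(xs, [0.0] * len(xs), p)
moment = sum(abs(x) ** p for x in xs) / len(xs)
print(dist_to_dirac ** p, moment)
```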

Relation to the push-forward

Proposition. Let \(X: \mathbb R^d \to \mathbb R^d\) be Lipschitz, and \(\mu, \nu \in \mathcal P_p (\Rd)\). Then

\[ d_p (X_\# \mu, X_\# \nu) \le \| \nabla X \|_{L^\infty} d_p (\mu, \nu). \]

Proof

Let \(\pi \in \Pi (\mu, \nu)\)

Define \(\widetilde \pi = (X,X)_\# \pi\). Then \(\widetilde \pi \in \Pi ( X_\# \mu, X_\# \nu )\) and

\(\displaystyle \int_{\Rd \times \Rd} |x-y|^p \diff (X,X)_\# \pi (x,y) = \int_{\Rd \times \Rd} |X(x)-X(y)|^p \diff \pi (x,y)\)

\(\displaystyle \phantom{\int_{\Rd \times \Rd} |x-y|^p \diff (X,X)_\# \pi (x,y)}\le\|\nabla X\|_{L^\infty}^p\int_{\Rd \times \Rd} |x-y|^p \diff \pi (x,y)\)

We take \(\inf\) over \(\widetilde \pi\), then \(\inf\) over \(\pi\). \(\square\)
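The proposition can be tested on samples in dimension one. The sketch assumes the map \(X(x) = 2 \sin x\) (so \(\|\nabla X\|_{L^\infty} = 2\)), two random equal-weight samples, and the sorted-matching formula for \(d_p\) on the line; all of these are illustrative assumptions.

```python
import math
import random

def wasserstein_1d(xs, ys, p=2):
    """d_p between equal-weight empirical measures on the line (sorted matching)."""
    n = len(xs)
    total = sum(abs(x - y) ** p for x, y in zip(sorted(xs), sorted(ys)))
    return (total / n) ** (1.0 / p)

random.seed(0)
lip = 2.0

def X(x):
    """Illustrative Lipschitz map with |X'| <= lip."""
    return lip * math.sin(x)

mu = [random.gauss(0.0, 1.0) for _ in range(200)]
nu = [random.gauss(1.0, 2.0) for _ in range(200)]

lhs = wasserstein_1d([X(x) for x in mu], [X(y) for y in nu])  # d_p(X# mu, X# nu)
rhs = lip * wasserstein_1d(mu, nu)                            # Lip(X) d_p(mu, nu)
print(lhs, rhs)
```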

Back to \(\frac{\partial \rho}{\partial t} + \diver( \rho v_t(x) ) = 0\)

The solution is given by \(\rho_t \diff x = (X_t)_\# (\rho_0 \diff x)\) where

\[ \begin{cases} \dfrac{\partial X_t}{\partial t} = v_t(X_t) & t > 0, \\ X(0,y) = y. \end{cases} \]

Then

\[ \begin{cases} \dfrac{\partial}{\partial t} \left( \nabla X_t \right) = ( \nabla v_t (X_t) ) \cdot \nabla X_t & t > 0, \\ \nabla X_0(y) = \mathrm{I}. \end{cases} \]

so we get a nice estimate

\[ \|\nabla X_t\|_{L^\infty} \le \exp\left ( \int_0^t \|\nabla v_s\|_{L^\infty} \diff s \right) . \]

Well-posedness in Wasserstein space

For \[ \frac{\partial \rho}{\partial t} + \diver( \rho v_t(x) ) = 0 \]

we get the continuous dependence estimate

\[ d_p \Big(\rho_t \diff x,\overline \rho_t \diff x\Big) \le \exp\left ( \int_0^t \|\nabla v_s\|_{L^\infty} \diff s \right) d_p \Big(\rho_0 \diff x, \overline \rho_0 \diff x\Big). \]

Similarly, \(\mu \in \mathcal C([0,T]; \mathcal P_p (\Rd))\), since

\(\displaystyle d_p \Big((X_t)_\# \mu_0, (X_s)_\# \mu_0\Big)^p \le \int_{\Rd}|X_t(x) - X_s(x)|^p \diff \mu_0(x)\)

\(\displaystyle \phantom{ d_p \Big((X_t)_\# \mu_0, (X_s)_\# \mu_0\Big)^p} \le \int_{\Rd} (1 + |x|)^p \diff \mu_0(x) \left( \sup_{x \in \mathrm{supp} (\mu_0)} \frac{| X_t(x) - X_s(x)|}{1 + |x|}\right)^p\)

Transport equation with measure data

\[ \frac{\partial \mu}{\partial t} + \diver( \mu v_t(x) ) = 0 \]

Continuous dependence estimate for measure data

\[ d_p(\mu_t, \overline \mu_t) \le \exp\left ( \int_0^t \|\nabla v_s\|_{L^\infty} \diff s \right) d_p (\mu_0, \overline \mu_0). \]

But if \[ \mu_0 = \sum_{i=1}^N a_i \delta_{y_i} \]

then

\[ (X_t)_\# \mu_0 = \sum_{i=1}^N a_i \delta_{X_t(y_i)} \]

These are called particle systems. We only need to solve finitely many ODEs!

The particle method for \(\frac{\partial \mu}{\partial t} + \diver( \mu v_t(x) ) = 0\)

  1. Take \(\mu_0 \in \mathcal P_p (\Rd)\)

  2. Approximate by \(\mu_0^N = \sum_{i=1}^N a_i \delta_{y_i}\)

  3. Solve for \(i = 1, \cdots, N\) the ODEs \[ \begin{cases} \dfrac{\partial x_t^{(i)}}{\partial t} = v_t (x_t^{(i)}) \\ x_0^{(i)} = y_i. \end{cases} \]

  4. Write the approximate solution \[ \mu^N_t = \sum_{i=1}^N a_i \delta_{x_t^{(i)}} \]
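The four steps above can be sketched in a few lines. The example assumes dimension one, the velocity field \(v_t(x) = -x\) (so the exact characteristics are \(y e^{-t}\)), equal weights, and Heun's method as ODE solver; all of these are illustrative choices, not the author's implementation.

```python
import math

def v(t, x):
    """Illustrative velocity field; exact characteristics are y * exp(-t)."""
    return -x

def particle_method(weights, positions, t_final, steps=1000):
    """Transport the atoms of sum a_i delta_{y_i} along dx/dt = v_t(x) with Heun's method."""
    h = t_final / steps
    xs = list(positions)
    t = 0.0
    for _ in range(steps):
        predictor = [x + h * v(t, x) for x in xs]
        xs = [x + 0.5 * h * (v(t, x) + v(t + h, xp)) for x, xp in zip(xs, predictor)]
        t += h
    # The weights are untouched: only the positions move.
    return list(zip(weights, xs))

N = 5
weights = [1.0 / N] * N               # step 2: weights a_i
y = [-2.0, -1.0, 0.0, 1.0, 2.0]       # step 2: atoms y_i
mu_t = particle_method(weights, y, t_final=1.0)   # steps 3 and 4
exact = [yi * math.exp(-1.0) for yi in y]
error = max(abs(x - xe) for (_, x), xe in zip(mu_t, exact))
print(mu_t, error)
```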

Transport equation with measure data. Distributional solutions

We look at \(\mu_t = \sum_{i=1}^N a_i \delta_{x^{(i)}_t}\) where \(\frac{\diff x^{(i)}_t}{\diff t} = v_t(x^{(i)}_t).\)

Integrating against a test function \(\int_\Rd \varphi_t(x) \diff \mu_t = \sum_{i=1}^N a_i \varphi_t({x^{(i)}_t})\).

\(\displaystyle\frac{\partial }{\partial t} \int_\Rd \varphi_t(x) \diff \mu_t = \sum_{i=1}^N a_i \frac{\partial \varphi_t}{\partial t} ({x^{(i)}_t}) + \sum_{i=1}^N a_i \nabla \varphi_t ({x^{(i)}_t}) \cdot \frac{\diff x^{(i)}_t}{\diff t}\)

\(\displaystyle \phantom{\frac{\partial }{\partial t} \int_\Rd \varphi_t(x) \diff \mu_t}= \sum_{i=1}^N a_i \left( \frac{\partial \varphi_t}{\partial t} ({x^{(i)}_t}) + \nabla \varphi_t ({x^{(i)}_t}) \cdot v_t(x^{(i)}_t) \right)\)

\(\displaystyle \phantom{\frac{\partial }{\partial t} \int_\Rd \varphi_t(x) \diff \mu_t}= \int_\Rd \left( \frac{\partial \varphi_t}{\partial t} (x) + \nabla \varphi_t (x) \cdot v_t(x) \right) \diff \mu_t (x)\)

For \(\varphi \in C_c^\infty([0,+\infty) \times \Rd)\) we have

\[ \int_0^\infty \int_\Rd \left( - \frac{\partial \varphi_t}{\partial t} - \nabla \varphi_t \cdot v_t(x) \right) \diff \mu_t \diff t = \int_\Rd \varphi_0(x) \diff \mu_0 \]

Interacting particles

The equation \(\frac{\partial \mu}{\partial t} + \diver( \mu v_t(x) ) = 0\) admits solutions with decoupled particles

\[ \mu_t = \sum_{i=1}^N a_i \delta_{x^{(i)}_t} \qquad \text{where } \frac{\diff x^{(i)}_t}{\diff t} = v_t (x^{(i)}_t) \]

People also care about models of interacting particles, e.g.

\[ \mu_t = \sum_{i=1}^N a_i \delta_{x^{(i)}_t} \qquad \text{where } \frac{\diff x^{(i)}_t}{\diff t} = - \sum_{j=1}^N a_j \nabla W (x^{(i)}_t - x^{(j)}_t). \]


We look at \[ \mu_t = \sum_{i=1}^N a_i \delta_{x^{(i)}_t} \qquad \text{ where } \frac{\diff x^{(i)}_t}{\diff t} = - \sum_{j=1}^N a_j \nabla W (x^{(i)}_t - x^{(j)}_t). \]

First notice that \[ \sum_{j=1}^N a_j \nabla W (x^{(i)}_t - x^{(j)}_t) = \int_\Rd \nabla W(x^{(i)}_t - y) \diff \mu_t(y) = \nabla W * \mu_t (x^{(i)}_t) = - v_t(x^{(i)}_t; \mu) \]

As before, for \(\varphi \in C_c^\infty([0,+\infty) \times \Rd)\) we have

\[ \int_0^\infty \int_\Rd \left( - \frac{\partial \varphi_t}{\partial t} + \nabla \varphi_t \cdot (\nabla W * \mu_t) \right) \diff \mu_t \diff t = \int_\Rd \varphi_0(x) \diff \mu_0 \]

Then, distributionally \[\begin{equation} \tag{AE} \frac{\partial}{\partial t} \mu_t = \diver (\mu_t \nabla W * \mu_t) \end{equation}\]

This is a non-local PDE. The solution does not have a natural structure \((X_t)_\# \mu_0\).
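The interacting particle system is straightforward to simulate. The sketch assumes dimension one, the quadratic potential \(W(z) = z^2/2\) and explicit Euler time stepping (illustrative choices). For this potential the centre of mass is conserved and each particle relaxes towards it like \(e^{-t}\), which gives a simple check.

```python
import math

def grad_W(z):
    """Gradient of the illustrative potential W(z) = z^2 / 2."""
    return z

def velocity(i, xs, a):
    """-sum_j a_j grad W(x_i - x_j), i.e. -(grad W * mu)(x_i)."""
    return -sum(aj * grad_W(xs[i] - xj) for aj, xj in zip(a, xs))

def simulate(xs, a, t_final, steps=2000):
    """Explicit Euler for the coupled particle ODEs."""
    h = t_final / steps
    xs = list(xs)
    for _ in range(steps):
        vel = [velocity(i, xs, a) for i in range(len(xs))]
        xs = [x + h * w for x, w in zip(xs, vel)]
    return xs

a = [0.25, 0.25, 0.5]      # weights, summing to one
x0 = [-1.0, 0.5, 2.0]      # initial positions
t_final = 1.0
xt = simulate(x0, a, t_final)

mean0 = sum(ai * x for ai, x in zip(a, x0))
mean_t = sum(ai * x for ai, x in zip(a, xt))
# Deviation from the exact relaxation x_i(t) - m = exp(-t) (x_i(0) - m).
relax_error = max(
    abs((x - mean0) - math.exp(-t_final) * (y - mean0)) for x, y in zip(xt, x0)
)
print(mean0, mean_t, relax_error)
```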

Benamou-Brenier formula

We showed that if you like some PDEs, Wasserstein can help.

The reverse is also true. For two given measures \(\mu_0, \mu_1\) we look at all the possible conservation equations linking them.

Define \(\mathrm{Conv}(\mu_0, \mu_1)\) as the set of pairs \((\mu, v)\) such that

  • \(\mu : [0,1] \to \mathcal P_2 (\Rd)\) is continuous w.r.t. the weak topology and joins \(\mu_0\) to \(\mu_1\)
  • \(v : [0,1] \times \Rd \to \Rd\) is Borel
  • \(\mu\) and \(v\) are related by \(\partial_t \mu_t + \diver(v_t \mu_t) = 0\)

Then \[ d_2(\mu_0, \mu_1)^2 = \inf_{(\mu,v) \in \mathrm{Conv}(\mu_0, \mu_1)} \int_0^1 \int_\Rd |v_t|^2 (x) \diff \mu_t(x) \diff t \]

See (Ambrosio, Brué & Semola, 2021, Lecture 17)
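The formula can be illustrated with two Dirac deltas, for which \(d_2(\delta_a, \delta_b) = |a-b|\). The sketch considers the assumed family of curves \(\mu_t = \delta_{x(t)}\) with \(x(t) = a + (b-a) t^k\) and velocity \(v_t = x'(t)\) at the particle, and compares their action with \(d_2^2\). Only the constant-speed curve \(k = 1\) attains the infimum within this family.

```python
def action(path_velocity, n=10000):
    """Midpoint approximation of int_0^1 |v_t|^2 dt along a single moving Dirac."""
    h = 1.0 / n
    return h * sum(path_velocity((i + 0.5) * h) ** 2 for i in range(n))

a, b = 0.0, 3.0
d2_squared = (b - a) ** 2   # d_2(delta_a, delta_b)^2

def velocity(k):
    """Velocity x'(t) of the illustrative path x(t) = a + (b - a) t^k."""
    return lambda t: (b - a) * k * t ** (k - 1)

# Action of each candidate curve in Conv(delta_a, delta_b).
actions = {k: action(velocity(k)) for k in (1, 2, 3)}
print(d2_squared, actions)
```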

Connection between the continuity equation and curves

The space \(AC^p([0,1]; M)\) is the space of functions \(\gamma:[0,1] \to M\) such that

there exists \(g \in L^p (0,1)\) such that \[ d(\gamma_y,\gamma_x) \le \int_x^y g(t) \diff t, \qquad \forall 0 \le x \le y \le 1. \]

Theorem Let \(\mu_t \in AC^2([0,1]; \mathcal P_2(\Rd))\). Then, there exists a velocity field \(v_t\) such that \(\mu\) is a solution of \[ \frac{\partial \mu_t}{\partial t} + \diver(v_t \mu_t) = 0. \]

Gradient flows in metric spaces

Gradient flows in \(\Rd\)

Let \(F: \Rd \to \mathbb R\). The gradient flow is \[ \frac{\diff u}{\diff t} = - \nabla F (u) \]

If \(D^2 F \ge \lambda I\) then \[|u(t) - \overline u(t) |\le e^{-\lambda t} |u(0) - \overline u(0)|.\]

If \(F\) is strictly convex and attains its minimum, for any \(u(0)\) we have \[u(t) \to u_\infty = \argmin F.\]

Gradient flow of \(F = \frac 1 2 x^2 + \frac 3 2 y^2\)
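The flow of this quadratic energy can be reproduced numerically. The sketch uses explicit Euler with a small step (an illustrative choice) and compares with the exact solution \((x_0 e^{-t}, y_0 e^{-3t})\). It also checks the contraction estimate with \(\lambda = 1\), the smallest eigenvalue of \(D^2 F\).

```python
import math

def grad_F(u):
    """Gradient of F(x, y) = x^2 / 2 + 3 y^2 / 2."""
    x, y = u
    return (x, 3.0 * y)

def flow(u0, t_final, steps=20000):
    """Explicit Euler for du/dt = -grad F(u)."""
    h = t_final / steps
    x, y = u0
    for _ in range(steps):
        gx, gy = grad_F((x, y))
        x, y = x - h * gx, y - h * gy
    return (x, y)

t = 1.0
u0, w0 = (1.0, 1.0), (-0.5, 2.0)   # two illustrative initial data
u, w = flow(u0, t), flow(w0, t)

exact_u = (u0[0] * math.exp(-t), u0[1] * math.exp(-3.0 * t))
dist0 = math.hypot(u0[0] - w0[0], u0[1] - w0[1])
dist_t = math.hypot(u[0] - w[0], u[1] - w[1])
print(u, exact_u, dist_t, math.exp(-t) * dist0)
```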

Gradient flows in Hilbert spaces

Let \(H\) be a Hilbert space, \(\mathcal F: \mathrm{Dom}(\mathcal F) \subset H \to \mathbb R\) Gateaux differentiable.

The gradient flow is the dynamical system given by

\[ \frac{\diff u}{\diff t} = - \mathcal F'(u) \]

If \(\mathcal F\) is strictly convex, there is at most one minimiser, and there are similar properties to the \(\Rd\) case.

Example. (Heat equation) \(\frac{\partial u}{\partial t} = \Delta u\).

      Corresponds to \(H = L^2(\Rd)\) and \(\mathcal F(u) = \frac 1 2 \int_\Rd |\nabla u|^2\)

Remark. If \(\mathcal F\) is convex, the implicit Euler scheme is convergent (Crandall & Liggett, 1971)

\[ \frac{U_{n+1}^{(\tau)} - U_n^{(\tau)}}{\tau} = - \mathcal F'(U_{n+1}^{(\tau)}) \]

      and each step can be recovered from \(\displaystyle U_{n+1}^{(\tau)} \in \argmin_{x \in H} \left(\frac{\| x - U_n^{(\tau)} \|_H^2 }{2\tau} + \mathcal F(x) \right)\)
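The equivalence between the implicit Euler step and the minimisation problem can be checked in the simplest case \(H = \mathbb R\), \(\mathcal F(x) = \frac{\lambda}{2} x^2\), where the implicit step has the closed form \(U_{n+1} = U_n / (1 + \tau \lambda)\). The ternary search used as minimiser and the parameter values are illustrative assumptions.

```python
def prox_objective(x, u_prev, tau, F):
    """The functional |x - u_prev|^2 / (2 tau) + F(x) minimised at each step."""
    return (x - u_prev) ** 2 / (2 * tau) + F(x)

def argmin_1d(g, lo, hi, iters=200):
    """Ternary search for the minimiser of a strictly convex function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if g(m1) < g(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

lam, tau = 2.0, 0.1

def F(x):
    """Illustrative convex energy F(x) = lam x^2 / 2."""
    return 0.5 * lam * x * x

u = 1.0
max_gap = 0.0
for _ in range(10):
    u_min = argmin_1d(lambda x: prox_objective(x, u, tau, F), -10.0, 10.0)
    u_euler = u / (1 + tau * lam)   # solves (U - u) / tau = -lam U
    max_gap = max(max_gap, abs(u_min - u_euler))
    u = u_euler
print(u, max_gap)
```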

Otto’s calculus (Otto, 1996), (Otto, 2001). Formal tangent bundle

We can think formally about the tangent space \(T_{\rho_0} \mathcal P_2 (\Rd)\).

A generic element of \(T_{\rho_0} \mathcal P_2 (\Rd)\) is the derivative of a curve passing through \(\rho_0\).

A generic curve \(t \in [0,1 ] \to \rho_t\) is a solution of the continuity equation in distributional sense \[ \frac{\partial \rho_t}{\partial t} = - \diver( \rho_t \nabla \psi_t ) \]

Formally the derivative at \(0\) \[ s = \frac{\partial}{\partial t} \Bigg|_{ t=0 } \rho_t = - \diver( \rho_0 \nabla \psi_0 ) \]

We can map \(s \in T_{\rho_0} \mathcal P_2 (\Rd)\) to some gradient field \(\nabla \psi_0\).

We can even define the metric tensor

\[ \langle s , \overline s \rangle_{\rho_0} = \int_\Rd \langle \nabla \psi, \nabla \overline \psi \rangle \diff \rho_0. \]

Formal gradient in \(\mathcal P_2\).

Let us do first the example \(\displaystyle \mathcal F[\rho \diff x] = \int_\Rd U(\rho) \diff x.\)

Take \(\xi = \nabla \zeta\) with \(\zeta \in C_c^\infty (\Rd)\). Let \(\rho_\ee := (\mathrm{id} + \ee \xi )_\# \rho_0 .\)

The map \((x, \ee) \mapsto \rho_\ee (x)\) is \(C^2\) and \(\displaystyle \qquad \lim_{\ee \to 0} \rho_\ee = \rho_0, \qquad \frac{\partial }{\partial \ee }\Big|_{\ee = 0} \rho_\ee = -\diver( \rho_0 \xi ).\)

\[ \lim_{\ee \to 0} \frac{\mathcal F [\rho_\ee] - \mathcal F[\rho_0]}{\ee} = - \int_\Rd U'(\rho_0) \nabla \cdot (\rho_0 \xi) = \int_\Rd \nabla \zeta \nabla U'(\rho_0) \diff \rho_0 \]

In distributional sense, we have that

\[ \nabla_{d_2} \left( \int_\Rd U(\rho) \right) = - \diver \left( \rho \nabla U'(\rho) \right) . \]

Formal gradient in \(\mathcal P_2\).

Similarly, using variation formulae (see (Giaquinta & Hildebrandt, 1996)), for general \(\mathcal F\) we get \[ \lim_{\ee \to 0} \frac{\mathcal F [\rho_\ee] - \mathcal F[\rho_0]}{\ee} = \int_\Rd \nabla \zeta \nabla \frac{\delta \mathcal F}{\delta \rho} [\rho_0] \diff \rho_0 \]

In distributional sense, we have that

\[ \nabla_{d_2} \mathcal F = - \diver \left( \rho \nabla \frac{\delta \mathcal F}{\delta \rho} \right) . \]

More details: (Ambrosio, Gigli & Savare, 2005:sec.10.4.1), (Ambrosio, Brué & Semola, 2021, Lecture 18)

Extension to \(\nabla_{d_p}\) is also available (Otto, 1996), and yields \(p\)-Laplacian flavoured variants.

Formal gradient flows in \(\mathcal P_2\)

Since \(\nabla_{d_2} \mathcal F = - \diver ( \rho \nabla \frac{\delta \mathcal F}{\delta \rho} )\) we have that

\[ \frac{\partial \rho_t}{\partial t} = \diver \left( \rho \nabla \frac{\delta \mathcal F}{\delta \rho} \right) \] is the \(2\)-Wasserstein gradient flow of the energy \(\mathcal F\).

Example. (The Heat Equation) \(\displaystyle \frac{\partial \rho}{\partial t} = \Delta \rho\)

      is the formal \(\mathcal P_2\) gradient flow of the Boltzmann entropy

\[\displaystyle \mathcal F[\rho] = \int_\Rd \rho \log \rho.\]

Example. (The Porous Medium Equation) \(\displaystyle \frac{\partial \rho}{\partial t} = \Delta \rho^m\) corresponds to \(\mathcal F [\rho] = \int_{\Rd} \frac 1 {m-1} \rho^m\).

Example. (Aggregation Equation) \(\frac{\partial \mu_t}{\partial t} = \diver (\mu_t \nabla W * \mu_t)\) corresponds to \(\mathcal F [\mu] = \tfrac 1 2 \int_\Rd (W * \mu) \diff \mu.\)

      For existence and uniqueness of the “correct type” of solutions see (Carrillo et al., 2011).

Formal gradient flows in \(\mathcal P_2\)

Example. (Aggregation-Diffusion Equation)

\[ \frac{\partial \rho}{\partial t} = \diver\left( \rho \nabla \left( U'(\rho) + V + W*\rho \right) \right) \]

      is formally the \(\mathcal P_2\)-gradient flow of the free energy

\[ \mathcal F[\rho] = \int_\Rd \Big( U(\rho) + V\rho + \frac 1 2 \rho (\rho*W) \Big) \diff x. \]

There is a correct notion of “convexity” that reproduces the usual properties

This problem allows for nice relative entropy arguments

And this is a beautiful story for another day.

Talk on 25 November 2022 at the UAM seminar.

Rigorous gradient flows in metric spaces: curves of maximal slope

When \(X\) is a Banach space, \(\displaystyle \frac{\partial \rho}{\partial t} = - \nabla_{X} \mathcal F [\rho(t)]\) in \(X^*\).

The main idea is the equivalence for \(u : [0,T] \to \Rd\) that \[ u'(t) = - \nabla \mathcal F (u), \qquad \iff \qquad \begin{cases} \dfrac{\diff }{\diff t} (\mathcal F \circ u ) = -| \nabla \mathcal F (u) | |u'| & \text{orientation} \\ |u'| = |\nabla \mathcal F (u)| & \text{norm} \end{cases} \]

We define the metric derivative and the metric slope \[ | \mu' | (t)= \limsup_{h \to 0} \frac{ d(\mu(t+h) , \mu(t)) }{|h|}, \qquad | \partial \mathcal F | [\mu] = \limsup_{\nu \to \mu} \frac{ (\mathcal F [\mu] - \mathcal F[\nu])_+ }{d (\mu, \nu)}\]

Definition (curve of maximal slope)

A locally abs. cont. curve \(t \mapsto \mu (t) \in M\) such that \(t \mapsto \mathcal F[\mu(t)]\) is abs. cont. and \[ \frac 1 2 \int_s^t |\mu'|^2(r) \diff r + \frac 1 2 \int_s^t |\partial \mathcal F|^2 [\mu(r)] \diff r \le \mathcal F [\mu(s)] - \mathcal F [\mu(t)] \qquad \forall 0 \le s < t \le T \]

“Implicit Euler” in metric spaces. The JKO scheme

Let \((M,d)\) be a metric space and \(\mathcal F: M \to \mathbb R\).

The JKO scheme (Jordan, Kinderlehrer & Otto, 1998) extends the implicit Euler scheme in the minimisation form

\[\begin{equation} \tag{JKO} U_{n+1}^{(\tau)} \in \argmin_{x \in M} \left(\frac{d(x,U_n^{(\tau)})^2 }{2\tau} + \mathcal F(x) \right). \end{equation}\]

To be more rigorous, define \[ u_t^{(\tau)} = U_{n}^{(\tau)} \quad \text{for } n \tau \le t < (n+1) \tau. \]
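Since (JKO) only uses the distance and the energy, it makes sense even when \(\mathcal F\) is not differentiable. The sketch below is a toy instance, not the Wasserstein case: it assumes \(M\) is a finite grid in \([-3, 3]\) with \(d(x,y) = |x-y|\), takes \(\mathcal F(x) = |x|\), and solves each step by brute-force minimisation over the grid.

```python
def jko_step(u_prev, tau, F, d, candidates):
    """One JKO step: minimise d(x, u_prev)^2 / (2 tau) + F(x) over a finite candidate set."""
    return min(candidates, key=lambda x: d(x, u_prev) ** 2 / (2 * tau) + F(x))

grid = [i / 100.0 for i in range(-300, 301)]   # finite sample of M = [-3, 3]

def d(x, y):
    """Metric on M (here the usual distance on the line)."""
    return abs(x - y)

F = abs        # non-smooth illustrative energy F(x) = |x|
tau = 0.1

U = [2.0]      # U_0
for _ in range(30):
    U.append(jko_step(U[-1], tau, F, d, grid))

# Energy along the discrete curve; it should never increase.
energies = [F(u) for u in U]
print(U[:3], U[-1])
```

In this example each step moves the iterate by \(\tau\) towards the minimiser at \(0\) until it is reached, after which the scheme stays there.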

(Ambrosio, Gigli & Savare, 2005) explains how JKO converges to curves of maximal slope in \(\mathcal P_2\).

Numerics (Carrillo et al., 2019)

And this conversation is left for another day

A mention on the Fields Medals

Cédric Villani won the Fields Medal in 2010 (see (Yau, 2011))

Alessio Figalli won the Fields Medal in 2018.

  • Thesis 2007: Optimal transportation and action-minimizing measures

    He used this to study the Monge-Ampère equation.

  • Hardy-Littlewood-Sobolev inequalities

  • obstacle problem (with L. Caffarelli)

  • fractional Laplacian

Beyond characteristics and gradient flows: the duality approach

Multi-species problems

Consider multiple interacting species indexed by \(k = 1, \cdots ,K\)

and each species with several particles indexed by \(i\).

Between each two species \(k, \ell\) there is an interaction potential \(W_{k,\ell}\).

Then \[ \frac{\diff}{\diff t} x^{(k,i)}_t = - \sum_\ell \sum_{j} a_{j,k} \nabla W_{k,\ell} (x^{(k,i)}_t - x^{(\ell,j)}_t). \]

This gives a system of PDEs for the empirical measures \[ \left\{ \sum_{i=1}^N a_i \delta_{x_i} : N \in \mathbb N , a_i \ge 0, x_i \in \Rd, \sum_i a_i = 1 \right\} \]

These are not gradient flows, except in particular cases (Di Francesco, Esposito & Fagioli, 2018)


Vanishing viscosity approximation

If we introduce Brownian noise to the particles, then it is natural to study \[ \frac{\partial}{\partial t} \mu^{(k)}_t = \ee \Delta \mu^{(k)} + \diver\left(\mu^{(k)} \nabla \sum_{\ell} W_{k,\ell} * \mu^{(\ell)} \right). \]

This is also the vanishing viscosity approximation.

It does not admit characteristics.

In (Carrillo & G-C, 2022) we study well-posedness by duality

(no characteristics and no gradient flow structure)

Duality approach to \(\mathcal P_1\)

The duality characterisation (Villani, 2003, Theorem 1.14)

\[ d_1 (\mu, \nu) = \sup \left\{ \int_\Rd \psi \diff (\mu - \nu) : \Lip(\psi) \le 1 \right\}. \]
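The duality formula can be probed on the real line. The sketch assumes two equal-weight empirical measures, computes \(d_1\) by the sorted matching (a standard one-dimensional fact, not from the slides), and evaluates \(\int \psi \diff (\mu - \nu)\) for a handful of \(1\)-Lipschitz test functions chosen for illustration. Each value is a lower bound for \(d_1\), and here \(\psi(x) = -x\) attains it because every point of the second sample lies to the right of its sorted partner.

```python
import math

def wasserstein1_1d(xs, ys):
    """d_1 between equal-weight empirical measures on the line (sorted matching)."""
    n = len(xs)
    return sum(abs(x - y) for x, y in zip(sorted(xs), sorted(ys))) / n

def dual_value(psi, xs, ys):
    """int psi d(mu - nu) for mu, nu uniform on the samples xs, ys."""
    n = len(xs)
    return sum(psi(x) for x in xs) / n - sum(psi(y) for y in ys) / n

xs = [0.0, 1.0, 4.0]
ys = [2.0, 3.0, 5.0]
d1 = wasserstein1_1d(xs, ys)

# A few 1-Lipschitz test functions (illustrative choices).
test_functions = [
    lambda x: x,
    lambda x: -x,
    math.sin,
    abs,
    lambda x: -abs(x - 2.5),
]
dual_values = [dual_value(psi, xs, ys) for psi in test_functions]
print(d1, dual_values)
```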

Duality approach to PDEs

We will look at the problems, for \(\ee \ge 0\) \[ \partial_t \mu_t + \diver (\mu v_t(x)) = \ee \Delta \mu_t \]

The distributional formulation is \[ \int_\Rd \varphi_T \diff \mu_T - \int_0^T \int_\Rd \left( \frac{\partial \varphi_t}{\partial t} + v_t(x) \cdot \nabla \varphi_t + \ee \Delta \varphi_t \right) \diff \mu_t \diff t = \int_\Rd \varphi_0 \diff \mu_0. \]

The central term can be cancelled if we take \(\varphi_t = \psi_{T-t}\) given by the adjoint problem \[ \frac{\partial \psi_s}{\partial s} = \nabla \psi_s \cdot v_{T-s}(x) + \ee \Delta \psi_s \]

We get \[ \int_\Rd \psi_0 \diff \mu_T = \int_\Rd \psi_T \diff \mu_0. \]

Duality approach to PDEs. (Carrillo & G-C, 2022)

We will look at the problems, for \(\ee \ge 0\) \[ \partial_t \mu_t + \diver (\mu v_t(x)) = \ee \Delta \mu_t, \qquad \qquad \partial_t \overline \mu_t + \diver (\overline \mu_t \overline v_t(x)) = \ee \Delta \overline \mu_t. \]

We take the test functions \[ \partial_s \psi_s = \nabla \psi_s \cdot v_{T-s}(x) + \ee \Delta \psi_s \qquad \qquad \partial_s \overline \psi_s = \nabla \overline \psi_s \cdot \overline v_{T-s}(x) + \ee \Delta \overline \psi_s \] with \(\psi_0 = \overline \psi_0\)

Then \(\displaystyle \int_\Rd \psi_0 \diff (\mu_T - \overline \mu_T) = \int_\Rd \psi_T \diff \mu_0 - \int_\Rd \overline \psi_T \diff \overline \mu_0\)

Or even better \(\displaystyle \int_\Rd \psi_0 \diff (\mu_T - \overline \mu_T) = \int_\Rd \psi_T \diff (\mu_0 - \overline \mu_0) + \int_\Rd (\psi_T - \overline \psi_T) \diff \overline \mu_0\)

Thus

\[ \begin{aligned} d_1(\mu_T, \overline \mu_T) & \le \underbrace{ d_1 (\mu_0, \overline \mu_0) \sup_{\Lip(\psi_0) \le 1} \Lip(\psi_T) }_{\textrm{Cont. dependence on }\mu_0} + \underbrace{ \left( 1 + \int_\Rd |x| \diff \overline \mu_0 \right) \sup_{\Lip(\psi_0) \le 1} \left \| \frac{\psi_T - \overline \psi_T}{1+|x|} \right \|_{L^\infty} }_{\textrm{Cont. dependence on } v} . \end{aligned} \]

Dual-viscosity formulation (Carrillo & G-C, 2022)

Let us go back to the system \[ \frac{\partial}{\partial t} \mu^{(k)}_t = \ee \Delta \mu^{(k)} + \diver\left(\mu^{(k)} \nabla \sum_{\ell} W_{k,\ell} * \mu^{(\ell)} \right) \]

where, for simplicity in this presentation, we assume \(\nabla W_{k,\ell} \in \Lip\).

Definition. (dual-viscosity solution) \(\mu^{(k)} \in C( [0,\infty), \mathcal P_1(\Rd) )\) such that

      For each \(T\ge 0\), each \(k\), and each \(\psi_0\) with \(\Lip(\psi_0) < \infty\), if \(\psi^{(k),T}\) is the unique viscosity solution of \[ \frac{\partial}{\partial t} \psi = \ee \Delta \psi + \nabla \psi \cdot \nabla \left( \sum_{\ell} W_{k,\ell} * \mu^{(\ell)} \right) \]

      then \(\displaystyle \int_\Rd \psi_0 \diff \mu_T = \int_\Rd \psi_T \diff \mu_0.\)

Dual-viscosity formulation (Carrillo & G-C, 2022)

Theorem. For all \(\varepsilon \ge 0\), the problem is well-posed in the class of dual-viscosity solutions.

Our framework covers more general settings: \(\nabla \sum_{\ell} W_{k,\ell} * \mu^{(\ell)}\) replaced by general \(v_t[\mu]\).

Proof: A priori estimates for \(\psi\) and fixed point argument. \(\square\)

Another beautiful story, which is left for another day.

Conclusion

  • The conservation laws \[\begin{equation} \tag{C} \frac{\partial \rho_t}{\partial t} + \diver (\rho_t v_t) = 0 \end{equation}\] are solved as the push-forward through characteristics
  • Wasserstein spaces are natural to study (C)
  • In fact, (C) characterises \(\mathcal P_2\) by Benamou-Brenier
  • Many other conservation laws are gradient flows in Wasserstein space

Thank you!

Bibliography

References

Ambrosio, L., Brué, E. & Semola, D. (2021) Lectures on optimal transport. Springer International Publishing. doi:10.1007/978-3-030-72162-6.
Ambrosio, L., Gigli, N. & Savare, G. (2005) Gradient Flows. Lectures in Mathematics ETH Zürich. Basel, Birkhäuser-Verlag. doi:10.1007/b137080.
Carrillo, J.A., Craig, K., Wang, L. & Wei, C. (2019) Primal dual methods for Wasserstein gradient flows. arXiv. https://arxiv.org/abs/1901.08081.
Carrillo, J.A., DiFrancesco, M., Figalli, A., Laurent, T. & Slepčev, D. (2011) Global-in-time weak measure solutions and finite-time aggregation for nonlocal interaction equations. Duke Math. J. 156 (2), 229–271. doi:10.1215/00127094-2010-211.
Carrillo, J.A. & G-C (2022) Interpreting systems of continuity equations in spaces of probability measures through PDE duality. June 2022. https://arxiv.org/abs/2206.03968.
Crandall, M.G. & Liggett, T.M. (1971) Generation of Semi-Groups of Nonlinear Transformations on General Banach Spaces. Am. J. Math. 93 (2), 265. doi:10.2307/2373376.
Desvillettes, L. & Villani, C. (2005) On the trend to global equilibrium for spatially inhomogeneous kinetic systems: The Boltzmann equation. Inventiones mathematicae. 159 (2), 245–316. doi:10.1007/s00222-004-0389-9.
Di Francesco, M., Esposito, A. & Fagioli, S. (2018) Nonlinear degenerate cross-diffusion systems with nonlocal interaction. Nonlinear Anal. Theory, Methods Appl. 169, 94–117. doi:10.1016/j.na.2017.12.003.
Evans, L.C. (1998) Partial Differential Equations. Providence, Rhode Island, American Mathematical Society.
Giaquinta, M. & Hildebrandt, S. (1996) The Lagrangian formalism. Calculus of variations. I. Grundlehren der mathematischen wissenschaften. Springer-Verlag, Berlin.
Jordan, R., Kinderlehrer, D. & Otto, F. (1998) The variational formulation of the Fokker-Planck equation. SIAM J. Math. Anal. 29 (1), 1–17. doi:10.1137/S0036141096303359.
Lott, J. & Villani, C. (2009) Ricci curvature for metric-measure spaces via optimal transport. Annals of Mathematics. 169 (3), 903–991. http://www.jstor.org/stable/25662148.
Mouhot, C. & Villani, C. (2011) On Landau damping. Acta Mathematica. 207 (1), 29–201. doi:10.1007/s11511-011-0068-9.
Otto, F. (1996) Double degenerate diffusion equations as steepest descent. 1–43. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.36.5263&rep=rep1&type=pdf.
Otto, F. (2001) The geometry of dissipative evolution equations: The porous medium equation. Commun. Partial Differ. Equations. 26 (1-2), 101–174. doi:10.1081/PDE-100002243.
Otto, F. & Villani, C. (2000) Generalization of an inequality by Talagrand and links with the logarithmic Sobolev inequality. Journal of Functional Analysis. 173 (2), 361–400. doi:10.1006/jfan.1999.3557.
Villani, C. (2009) Optimal Transport. Grundlehren der mathematischen wissenschaften. Berlin, Heidelberg, Springer Berlin Heidelberg. doi:10.1007/978-3-540-71050-9.
Villani, C. (2003) Topics in optimal transportation. American Mathematical Society. doi:10.1090/gsm/058.
Yau, H.-T. (2011) The work of Cédric Villani. In: Proceedings of the international congress of mathematicians 2010 (ICM 2010). June 2011 Published by Hindustan Book Agency, India. doi:10.1142/9789814324359_0004.