PDEs and Wasserstein Spaces
Seminario de Análisis Matemático y Matemática Aplicada UCM
Nov. 3, 2022
Some simple PDEs
Transport equations. Constant velocity
\[ \DeclareMathOperator{\diver}{div} \newcommand{\R}{{\mathbb R}} \newcommand{\Rd}{{\mathbb R^d}} \newcommand{\diff}{\,\mathrm{d}} \DeclareMathOperator*{\argmin}{argmin} \newcommand{\ee}{\varepsilon} \DeclareMathOperator{\Lip}{Lip} \]
One of the easiest PDEs to solve is, for \(t,x \in \mathbb R\) and a constant \(a \in \mathbb R\), \[ \frac{\partial \rho_t}{\partial t} (x) + a \frac{\partial \rho_t}{\partial x} (x) = 0 \]
It can be solved by characteristics \(\rho_t (X_t(y)) = \rho_0(y)\). Plugging this in
\[ 0 = \frac{\partial }{\partial t} \Big( \rho_t \circ X_t \Big) = \frac{\partial \rho_t}{\partial t} (X_t) + \frac{\partial \rho_t}{\partial x} (X_t) \frac{\partial X_t}{\partial t} \]
It suffices that \(\frac{\partial X_t}{\partial t} = a\), with \(X_0(y) = y\) to meet the initial datum.
Hence \(X_t(y) = y + at\), and
\[ \rho_t(x) = \rho_0 (x - at) \]
Transport equation. Non-divergence form
More generally, if \(x \in \Rd\), the equation \[ \frac{\partial \rho}{\partial t} + v_t(x) \cdot \nabla \rho = 0 \] still admits solutions by characteristics \(\rho_t \circ X_t = \rho_0\).
If \(v_t\) is Lipschitz in \(x\), the field of characteristics is the unique solution of
\[ \begin{cases} \dfrac{\partial X_t}{\partial t} = v_t(X_t) & t > 0, \\ X_0(y) = y. \end{cases} \]
The map \(X_t: \Rd \to \Rd\) is bijective, since we can also solve “backwards” in time:
\[ \begin{cases} \dfrac{\partial Y_s}{\partial s} = - v_{t-s}(Y_s) & 0 < s \le t, \\ Y_0(x) = x. \end{cases} \]
Clearly \(X_t(Y_t(x)) = x\). So \(\rho_t = \rho_0 \circ Y_t\).
Transport equation. Divergence form
\[ \frac{\partial \rho}{\partial t} + \diver( \rho v_t(x) ) = 0 \]
We can no longer solve by normal characteristics.
But we can use generalised characteristics (Evans, 1998:chap.3).
We write \[ \frac{\partial \rho}{\partial t} + \nabla \rho \cdot v_t + \rho \diver v_t = 0 \]
In this case it suffices to look for \(\rho_t \circ X_t = A_t \rho_0\) with \(A_t(y) \in \R\), where
\[ \begin{cases} \dfrac{\partial X_t}{\partial t} = v_t(X_t) & t > 0, \\ X_0(y) = y. \end{cases} \qquad \qquad \begin{cases} \dfrac{\partial A_t}{\partial t} = -A_t \diver v_t(X_t) & t > 0, \\ A_0(y) = 1. \end{cases} \]
Finally, we can write
\[ \rho_t(X_t(y)) = \rho_0(y) e^{-\int_0^t \diver v_s(X_s(y)) \diff s} \]
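The generalised characteristics can be checked numerically. The following is a minimal sketch, assuming \(d = 1\) and a hypothetical test field \(v_t(x) = x\) (so \(\diver v_t = 1\), \(X_t(y) = y e^t\) and the amplitude is \(e^{-t}\)). The function name `characteristic` and the Runge-Kutta discretisation are illustrative choices, not part of the talk.

```python
import math

def characteristic(y, v, div_v, t_end, steps=200):
    """RK4 for X' = v(t, X), A' = -div_v(t, X) * A with X(0) = y, A(0) = 1."""
    def rhs(t, x, a):
        return v(t, x), -div_v(t, x) * a
    h = t_end / steps
    t, x, a = 0.0, y, 1.0
    for _ in range(steps):
        k1x, k1a = rhs(t, x, a)
        k2x, k2a = rhs(t + h / 2, x + h / 2 * k1x, a + h / 2 * k1a)
        k3x, k3a = rhs(t + h / 2, x + h / 2 * k2x, a + h / 2 * k2a)
        k4x, k4a = rhs(t + h, x + h * k3x, a + h * k3a)
        x += h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        a += h / 6 * (k1a + 2 * k2a + 2 * k3a + k4a)
        t += h
    return x, a

# Assumed test data (not from the talk): v(x) = x and a Gaussian initial datum.
v = lambda t, x: x
div_v = lambda t, x: 1.0
rho0 = lambda y: math.exp(-y * y)

y, t_end = 0.5, 1.0
x_t, a_t = characteristic(y, v, div_v, t_end)
rho_t_at_x = a_t * rho0(y)  # value of rho_t at the point X_t(y)
# Explicit solution for this field: rho_t(x) = e^{-t} rho_0(x e^{-t}).
exact = math.exp(-t_end) * rho0(x_t * math.exp(-t_end))
```

For this field the amplitude decays like \(e^{-t}\) while the characteristics spread like \(e^{t}\), which is exactly what keeps the total mass constant.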
Conservation
Let us consider \[ \frac{\partial \rho}{\partial t} + \diver( \rho v_t(x) ) = 0 \]
When \(d = 1\), it is easy to see from the explicit solution that \[ \int_{X_t(a)}^{X_t(b)} \rho_t(x) \diff x = \int_a^b \rho_0(y) \diff y. \]
For \(d > 1\), it is easier to show that, for any solution and any smooth \(A \subset \Rd\), \[ \frac{\diff}{\diff t} \int_{X_t(A)} \rho_t(x) \diff x = 0 \]
so
\[ \int_{A} \rho_t(x) \diff x = \int_{X_t^{-1}(A)} \rho_0(y) \diff y. \]
Transport of mass by characteristics
Push-forward
Let \(X, Y\) be measurable spaces,
\(T: X \to Y\) be a measurable map,
\(\mu \in \mathcal M(X)\) be a measure
The push-forward is the measure \(T_\# \mu = \nu \in \mathcal M(Y)\) such that
\[ \nu (B) = \mu (T^{-1} (B)), \qquad \forall B \subset Y \text{ measurable.} \]
Transport equation and push-forward
Let us consider \[ \frac{\partial \rho}{\partial t} + \diver( \rho v_t(x) ) = 0 \]
We had deduced that
\[ \int_{A} \rho_t(x) \diff x = \int_{X_t^{-1}(A)} \rho_0(y) \diff y. \]
With this notation, let \(\mu_t = \rho_t \diff x\), then
\[ \mu_t = (X_t)_\# \mu_0. \]
Push-forward and integration
If \(f : Y \to \mathbb R\) is a simple function taking the values \(z_1, \dots, z_N\), then
\(\displaystyle \int_{Y} f(y) \diff T_\# \mu (y) = \sum_{i=1}^N z_i (T_\#\mu) ( f^{-1} (\{ z_i \} ) )\)
\(\displaystyle \phantom{\int_{Y} f(y) \diff T_\# \mu (y)} = \sum_{i=1}^N z_i \mu ( T^{-1} f^{-1} ( \{ z_i \} ) )\)
\(\displaystyle \phantom{\int_{Y} f(y) \diff T_\# \mu (y)} = \sum_{i=1}^N z_i \mu ( (f \circ T)^{-1} ( \{ z_i \} ) )\)
\(\displaystyle \phantom{\int_{Y} f(y) \diff T_\# \mu (y)} = \int_X f (T(x)) \diff \mu (x).\)
In general
\[ \int_Y f(y) \diff T_\# \mu (y) = \int_X f(T(x)) \diff \mu (x). \]
Push-forward and Dirac deltas
We have shown \[ \int_Y f(y) \diff T_\# \mu (y) = \int_X f(T(x)) \diff \mu (x). \]
It is immediate to deduce that \[ T_\# \delta_0 = \delta_{T(0)} \]
and, in general, that \[ T_\# \sum_{i=1}^N a_i \delta_{x_i} = \sum_{i=1}^N a_i \delta_{T(x_i)} \]
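For discrete measures the push-forward and the change of variables formula fit in a few lines. This is a minimal sketch, assuming a measure is stored as a dictionary `{point: weight}`; the helpers `push_forward` and `integrate` and the sample data are hypothetical names chosen for illustration.

```python
def push_forward(mu, T):
    """Push-forward of a discrete measure mu = {x_i: a_i}: moves each atom to T(x_i)."""
    nu = {}
    for x, a in mu.items():
        y = T(x)
        nu[y] = nu.get(y, 0.0) + a  # atoms with the same image add their masses
    return nu

def integrate(f, mu):
    """Integral of f against the discrete measure mu."""
    return sum(a * f(x) for x, a in mu.items())

# Assumed example: mu = 1/2 delta_{-1} + 1/4 delta_0 + 1/4 delta_1 and T(x) = x^2.
mu = {-1.0: 0.5, 0.0: 0.25, 1.0: 0.25}
T = lambda x: x * x
f = lambda y: y + 1.0

nu = push_forward(mu, T)                  # 3/4 delta_1 + 1/4 delta_0
lhs = integrate(f, nu)                    # integral of f against T_# mu
rhs = integrate(lambda x: f(T(x)), mu)    # integral of f(T(x)) against mu
```

Since \(T\) is not injective here, the atoms at \(-1\) and \(1\) merge into a single atom at \(1\), and both integrals agree.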
Optimal transport and Wasserstein spaces
Optimal transport problem. Monge formulation
Given \(\mu \in \mathcal M(X)\) and \(\nu \in \mathcal M (Y)\),
Set of transports: \[\mathrm{Trans} (\mu, \nu) =\{ T: X \to Y \text{ measurable } \mid \, \nu = T_\# \mu \}\]
Given a “cost of transport” \(c(x,y)\) we look at the optimal transport problem
\[\begin{equation} \label{eq:Monge} \tag{M} \inf_{T \in \mathrm{Trans} (\mu, \nu)} \int_X c(x,T(x)) \, d \mu (x). \end{equation}\]
Clearly \(\mathrm{Trans} (\mu, \nu) = \emptyset\) if \(\mu(X) \ne \nu(Y)\).
We will work with probability measures: \(\mu(X) = 1\) and \(\mu (A) \ge 0\) for every measurable \(A\).
The problem of mass splitting:
\[ \mathrm{Trans} \left(\delta_0, \frac 1 2 \delta_0 + \frac 1 2 \delta_1 \right) = \emptyset. \]
Indeed, what would \(T(0)\) be?
Optimal transport problem. Kantorovich formulation
Let \(\mu \in \mathcal P (X)\) and \(\nu \in \mathcal P (Y)\)
Instead of working with maps \(T\), we work with measures on the product \(X \times Y\):
\[ \Pi(\mu, \nu) = \{ \pi \in \mathcal P(X \times Y) \mid \pi(A\times Y) = \mu(A) \text{ and } \pi(X \times B) = \nu(B) \text{ for all measurable } A \subset X, \, B \subset Y \} \]
This set is never empty, since \(\mu \otimes \nu \in \Pi (\mu, \nu)\), where
\[ (\mu \otimes \nu) (A \times B) = \mu (A) \nu(B). \]
E.g. \(\mu = \delta_{x_0}\) and \(\nu = \tfrac 1 2 \delta_{x_1} + \tfrac 1 2 \delta_{x_2}\)
The Kantorovich problem is \[\begin{equation} \tag{K} \inf_{ \pi \in \Pi (\mu, \nu) } \int_{X \times Y} c(x,y) \diff \pi(x,y). \end{equation}\]
Relation between Monge and Kantorovich formulations
If there is a transport map \(T \in \mathrm{Trans} (\mu, \nu)\), then we can take the plan \[\pi_T = (\mathrm{id} \otimes T)_\# \mu \in \Pi(\mu, \nu),\]
for which
\[ \int_{X \times Y} c(x,y) \diff \pi_T (x,y) = \int_{X} c(x,T(x)) \diff \mu (x). \]
Theorem (Pratelli) If \(\mu\) has no atoms and \(c: X\times Y \to [0,\infty)\) is continuous, then \((K) = (M)\). Furthermore, \((K)\) is a \(\min\).
More info: (Villani, 2003), (Villani, 2009), (Ambrosio, Brué & Semola, 2021, Lecture 2)
Theorem (Brenier, Knott-Smith)
Let \(X = Y = \Rd\), \(c(x,y) = |x-y|^2\), \(\mu, \nu \in \mathcal P (\Rd)\) with \(\mu \ll \mathcal L^d\), and assume \[ \int_\Rd |x|^2 \diff \mu(x), \int_\Rd |x|^2 \diff \nu(x) < \infty. \]
Then
\((K)\) is achieved with a unique minimiser \(\pi\).
\((M)\) is achieved with a unique minimiser \(T\).
Furthermore \(T = \nabla \psi\) with \(\psi : \Rd \to (-\infty,\infty]\) convex, l.s.c., \(\mu\)-a.e. differentiable.
If also \(\nu \ll \mathcal L^d\), the optimal transports in \(\mathrm{Trans} (\mu, \nu)\) and \(\mathrm{Trans} (\nu, \mu)\) are inverses of each other.
We also have the converse statement:
Let \(\psi : \Rd \to (-\infty,\infty]\) convex, l.s.c., \(\mu\)-a.e. differentiable, and \(|\nabla \psi|^2 \in L^1 (\mu)\).
Then \(T = \nabla \psi\) is an optimal transport between \(\mu\) and \(T_\# \mu\).
More info: (Ambrosio, Brué & Semola, 2021, Lecture 5)
The Wasserstein space
Let \(1 \le p < \infty\). For \(\mu, \nu \in \mathcal P(\Rd)\) take the Wasserstein distance \[ d_p(\mu, \nu) = \left( \inf_{\pi \in \Pi(\mu, \nu)} \int_{\Rd \times \Rd} |x-y|^p \diff \pi (x,y) \right)^{\frac 1 p} \]
We point out that \[ d_p (\mu, \delta_0)^p = \int_{\Rd} |x|^p \diff \mu (x) \]
We take the Wasserstein space \[ \mathcal P_p (\Rd) = \left \{ \mu \in \mathcal P (\Rd) : \int_\Rd |x|^p \diff \mu (x) < \infty \right \} \]
The set of empirical measures is dense in \(( \mathcal P_p (\Rd) , d_p )\):
\[ \left\{ \sum_{i=1}^N a_i \delta_{x_i} : N \in \mathbb N , a_i \ge 0, x_i \in \Rd, \sum_i a_i = 1 \right\} \]
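For empirical measures on the real line the distance \(d_p\) can be computed by sorting. The sketch below assumes \(d = 1\) and two measures with the same number \(N\) of atoms, each of mass \(1/N\); it relies on the standard fact that the monotone (sorted) matching is then optimal for \(p \ge 1\). The function names are hypothetical, and the brute-force search over permutation matchings is only a sanity check for small \(N\).

```python
from itertools import permutations

def wasserstein_1d(xs, ys, p=2):
    """d_p between (1/N) sum delta_{x_i} and (1/N) sum delta_{y_i} on the real line."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes the same number of atoms")
    n = len(xs)
    cost = sum(abs(x - y) ** p for x, y in zip(sorted(xs), sorted(ys))) / n
    return cost ** (1.0 / p)

def wasserstein_bruteforce(xs, ys, p=2):
    """Same distance, minimising over all permutation matchings (small N only)."""
    n = len(xs)
    best = min(sum(abs(x - y) ** p for x, y in zip(xs, perm))
               for perm in permutations(ys))
    return (best / n) ** (1.0 / p)

# Assumed sample data.
xs = [0.3, -1.0, 2.5, 1.1]
ys = [0.0, 4.0, -2.0, 1.0]
d_sorted = wasserstein_1d(xs, ys)
d_brute = wasserstein_bruteforce(xs, ys)

# Distance to delta_0: d_2(mu, delta_0)^2 is the second moment of mu.
moment = wasserstein_1d(xs, [0.0] * len(xs)) ** 2
```

The last line reproduces the identity \(d_p(\mu, \delta_0)^p = \int |x|^p \diff \mu\) for \(p = 2\).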
Relation to the push-forward
Proposition. Let \(X: \mathbb R^d \to \mathbb R^d\) be Lipschitz, and \(\mu, \nu \in \mathcal P_p (\Rd)\). Then
\[ d_p (X_\# \mu, X_\# \nu) \le \| \nabla X \|_{L^\infty} d_p (\mu, \nu). \]
Proof
Let \(\pi \in \Pi (\mu, \nu)\)
Define \(\widetilde \pi = (X,X)_\# \pi\). Then \(\widetilde \pi \in \Pi ( X_\# \mu, X_\# \nu )\) and
\(\displaystyle \int_{\Rd \times \Rd} |x-y|^p \diff (X,X)_\# \pi (x,y) = \int_{\Rd \times \Rd} |X(x)-X(y)|^p \diff \pi (x,y)\)
\(\displaystyle \phantom{\int_{\Rd \times \Rd} |x-y|^p \diff (X,X)_\# \pi (x,y)}\le\|\nabla X\|_{L^\infty}^p\int_{\Rd \times \Rd} |x-y|^p \diff \pi (x,y)\)
We take \(\inf\) over \(\widetilde \pi\), then \(\inf\) over \(\pi\), and finally the \(p\)-th root. \(\square\)
Back to \(\frac{\partial \rho}{\partial t} + \diver( \rho v_t(x) ) = 0\)
The solution is given by \(\rho_t \diff x = (X_t)_\# (\rho_0 \diff x)\) where
\[ \begin{cases} \dfrac{\partial X_t}{\partial t} = v_t(X_t) & t > 0, \\ X(0,y) = y. \end{cases} \]
Then
\[ \begin{cases} \dfrac{\partial}{\partial t} \left( \nabla X_t \right) = ( \nabla v_t (X_t) ) \cdot \nabla X_t & t > 0, \\ \nabla X_0(y) = \mathrm{I}. \end{cases} \]
so we get a nice estimate
\[ \|\nabla X_t\|_{L^\infty} \le \exp\left ( \int_0^t \|\nabla v_s\|_{L^\infty} \diff s \right) . \]
Well-posedness in Wasserstein space
For \[ \frac{\partial \rho}{\partial t} + \diver( \rho v_t(x) ) = 0 \]
we get the continuous dependence estimate
\[ d_p \Big(\rho_t \diff x,\overline \rho_t \diff x\Big) \le \exp\left ( \int_0^t \|\nabla v_s\|_{L^\infty} \diff s \right) d_p \Big(\rho_0 \diff x, \overline \rho_0 \diff x\Big). \]
Similarly, \(\mu \in \mathcal C([0,T]; \mathcal P_p (\Rd))\):
\(\displaystyle d_p \Big((X_t)_\# \mu_0, (X_s)_\# \mu_0\Big)^p \le \int_{\Rd}|X_t(x) - X_s(x)|^p \diff \mu_0(x)\)
\(\displaystyle \phantom{ d_p \Big((X_t)_\# \mu_0, (X_s)_\# \mu_0\Big)^p} \le \int_{\Rd} (1 + |x|)^p \diff \mu_0(x) \left( \sup_{x \in \mathrm{supp} (\mu_0)} \frac{| X_t(x) - X_s(x)|}{1 + |x|}\right)^p\)
Transport equation with measure data
\[ \frac{\partial \mu}{\partial t} + \diver( \mu v_t(x) ) = 0 \]
Continuous dependence estimate for measure data
\[ d_p(\mu_t, \overline \mu_t) \le \exp\left ( \int_0^t \|\nabla v_s\|_{L^\infty} \diff s \right) d_p (\mu_0, \overline \mu_0). \]
But if \[ \mu_0 = \sum_{i=1}^N a_i \delta_{y_i} \]
then
\[ (X_t)_\# \mu_0 = \sum_{i=1}^N a_i \delta_{X_t(y_i)} \]
These are called particle systems. We only need to solve finitely many problems!!
The particle method for \(\frac{\partial \mu}{\partial t} + \diver( \mu v_t(x) ) = 0\)
Take \(\mu_0 \in \mathcal P_p (\Rd)\)
Approximate by \(\mu_0^N = \sum_{i=1}^N a_i \delta_{y_i}\)
Solve for \(i = 1, \cdots, N\) the ODEs \[ \begin{cases} \dfrac{\partial x_t^{(i)}}{\partial t} = v_t (x_t^{(i)}) \\ x_0^{(i)} = y_i. \end{cases} \]
Write the approximate solution \[ \mu^N_t = \sum_{i=1}^N a_i \delta_{x_t^{(i)}} \]
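The four steps above can be sketched as follows. This is a minimal illustration, assuming the initial atoms \((a_i, y_i)\) are already given and using a classical Runge-Kutta integrator for the ODEs; the names `rk4_step` and `particle_method` and the rotation field used as test case are hypothetical choices, not taken from the talk.

```python
import math

def rk4_step(v, t, x, h):
    """One classical Runge-Kutta step for dx/dt = v(t, x), with x a tuple in R^d."""
    def shifted(k, c):
        return tuple(xi + c * ki for xi, ki in zip(x, k))
    k1 = v(t, x)
    k2 = v(t + h / 2, shifted(k1, h / 2))
    k3 = v(t + h / 2, shifted(k2, h / 2))
    k4 = v(t + h, shifted(k3, h))
    return tuple(xi + h / 6 * (a + 2 * b + 2 * c + d)
                 for xi, a, b, c, d in zip(x, k1, k2, k3, k4))

def particle_method(atoms, v, t_end, steps=200):
    """atoms: list of (a_i, y_i). Returns the atoms (a_i, x_i(t_end)) of mu^N at t_end."""
    h = t_end / steps
    result = []
    for a, y in atoms:
        x = y
        for n in range(steps):
            x = rk4_step(v, n * h, x, h)
        result.append((a, x))  # the weight a_i is transported unchanged
    return result

# Assumed test field in R^2: rigid rotation v(x) = (-x_2, x_1).
rotation = lambda t, x: (-x[1], x[0])
mu0 = [(0.5, (1.0, 0.0)), (0.5, (0.0, 2.0))]
mu_t = particle_method(mu0, rotation, math.pi / 2)
```

After a quarter turn the two atoms sit, up to discretisation error, at \((0,1)\) and \((-2,0)\), with their weights unchanged.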
Transport equation with measure data. Distributional solutions
We look at \(\mu_t = \sum_{i=1}^N a_i \delta_{x^{(i)}_t}\) where \(\frac{\diff x^{(i)}_t}{\diff t} = v_t(x^{(i)}_t).\)
Integrating against a test function, \(\int_\Rd \varphi_t(x) \diff \mu_t = \sum_{i=1}^N a_i \varphi_t({x^{(i)}_t})\), so
\(\displaystyle\frac{\partial }{\partial t} \int_\Rd \varphi_t(x) \diff \mu_t = \sum_{i=1}^N a_i \frac{\partial \varphi_t}{\partial t} ({x^{(i)}_t}) + \sum_{i=1}^N a_i \nabla \varphi_t ({x^{(i)}_t}) \cdot \frac{\diff x^{(i)}_t}{\diff t}\)
\(\displaystyle \phantom{\frac{\partial }{\partial t} \int_\Rd \varphi_t(x) \diff \mu_t}= \sum_{i=1}^N a_i \left( \frac{\partial \varphi_t}{\partial t} ({x^{(i)}_t}) + \nabla \varphi_t ({x^{(i)}_t}) \cdot v_t(x^{(i)}_t) \right)\)
\(\displaystyle \phantom{\frac{\partial }{\partial t} \int_\Rd \varphi_t(x) \diff \mu_t}= \int_\Rd \left( \frac{\partial \varphi_t}{\partial t} (x) + \nabla \varphi_t (x) \cdot v_t(x) \right) \diff \mu_t (x)\)
For \(\varphi \in C_c^\infty([0,+\infty) \times \Rd)\) we have
\[ \int_0^\infty \int_\Rd \left( - \frac{\partial \varphi_t}{\partial t} - \nabla \varphi_t \cdot v_t(x) \right) \diff \mu_t \diff t = \int_\Rd \varphi_0(x) \diff \mu_0 \]
Interacting particles
The equation \(\frac{\partial \mu}{\partial t} + \diver( \mu v_t(x) ) = 0\) admits solutions with decoupled particles
\[ \mu_t = \sum_{i=1}^N a_i \delta_{x^{(i)}_t} \qquad \text{where } \frac{\diff x^{(i)}_t}{\diff t} = v_t (x^{(i)}_t) \]
People also care about models of interacting particles, e.g.
\[ \mu_t = \sum_{i=1}^N a_i \delta_{x^{(i)}_t} \qquad \text{where } \frac{\diff x^{(i)}_t}{\diff t} = - \sum_{j=1}^N a_j \nabla W (x^{(i)}_t - x^{(j)}_t). \]
First notice that \[ \sum_{j=1}^N a_j \nabla W (x^{(i)}_t - x^{(j)}_t) = \int_\Rd \nabla W(x^{(i)}_t - y) \diff \mu_t(y) = \nabla W * \mu_t (x^{(i)}_t) = - v_t(x^{(i)}_t; \mu) \]
As before, for \(\varphi \in C_c^\infty([0,+\infty) \times \Rd)\) we have
\[ \int_0^\infty \int_\Rd \left( - \frac{\partial \varphi_t}{\partial t} + \nabla \varphi_t \cdot (\nabla W * \mu_t) \right) \diff \mu_t \diff t = \int_\Rd \varphi_0(x) \diff \mu_0 \]
Then, distributionally \[\begin{equation} \tag{AE} \frac{\partial}{\partial t} \mu_t = \diver (\mu_t \nabla W * \mu_t) \end{equation}\]
This is a non-local PDE. The solution does not have a natural structure \((X_t)_\# \mu_0\).
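A small simulation illustrates the interacting particle system. This is a sketch under stated assumptions: \(d = 1\), the hypothetical quadratic potential \(W(z) = z^2/2\) (so \(\nabla W(z) = z\)), and a forward Euler discretisation. For this potential the velocity of particle \(i\) reduces to \(-(x^{(i)}_t - m)\), with \(m = \sum_j a_j x^{(j)}_t\) the centre of mass, so \(m\) is conserved and every particle relaxes to it like \(e^{-t}\).

```python
import math

def grad_W(z):
    """Gradient of the assumed interaction potential W(z) = z^2 / 2."""
    return z

def velocity(i, weights, xs):
    """Velocity of particle i: minus the weighted sum of grad W(x_i - x_j)."""
    return -sum(a_j * grad_W(xs[i] - x_j) for a_j, x_j in zip(weights, xs))

def simulate(weights, xs, t_end, steps=1000):
    """Forward Euler for the coupled system of N ODEs."""
    h = t_end / steps
    xs = list(xs)
    for _ in range(steps):
        vel = [velocity(i, weights, xs) for i in range(len(xs))]
        xs = [x + h * u for x, u in zip(xs, vel)]
    return xs

# Assumed sample data: three particles with masses summing to one.
weights = [0.2, 0.3, 0.5]
ys = [-1.0, 0.0, 2.0]
xs_t = simulate(weights, ys, t_end=1.0)

mean_0 = sum(a * y for a, y in zip(weights, ys))
mean_t = sum(a * x for a, x in zip(weights, xs_t))
# Exact solution for this particular W: exponential relaxation to the centre of mass.
exact = [mean_0 + (y - mean_0) * math.exp(-1.0) for y in ys]
```

Unlike the decoupled case, the velocity of each particle must be recomputed from all the others at every time step.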
Benamou-Brenier formula
We showed that if you like some PDEs, Wasserstein can help.
The reverse is also true. For two given measures \(\mu_0, \mu_1\) we look at all the possible conservation equations linking them.
Define \(\mathrm{Conv}(\mu_0, \mu_1)\) as the set of pairs \((\mu, v)\) such that
- \(\mu : [0,1] \to \mathcal P_2 (\Rd)\) is continuous w.r.t. the weak topology and joins \(\mu_0\) to \(\mu_1\)
- \(v : [0,1] \times \Rd \to \Rd\) is Borel
- they are related by \(\partial_t \mu_t + \diver(v_t \mu_t) = 0\) in the distributional sense
Then \[ d_2(\mu_0, \mu_1)^2 = \inf_{(\mu,v) \in \mathrm{Conv}(\mu_0, \mu_1)} \int_0^1 \int_\Rd |v_t|^2 (x) \diff \mu_t(x) \diff t \]
See (Ambrosio, Brué & Semola, 2021, Lecture 17)
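The formula can be tested by hand on Dirac masses. The sketch below assumes \(d = 1\), \(\mu_0 = \delta_{x_0}\), \(\mu_1 = \delta_{x_1}\) and curves of the form \(\mu_t = \delta_{\gamma(t)}\) with velocity \(\gamma'(t)\) at the particle, so the action is \(\int_0^1 |\gamma'(t)|^2 \diff t\) and \(d_2(\mu_0,\mu_1)^2 = |x_1 - x_0|^2\). The function `action` and the two sample curves are illustrative choices.

```python
def action(gamma, steps=1000):
    """Approximates int_0^1 |gamma'(t)|^2 dt with difference quotients on a uniform grid."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        speed = (gamma((k + 1) * h) - gamma(k * h)) / h
        total += speed ** 2 * h
    return total

# Assumed endpoints.
x0, x1 = 0.0, 3.0
straight = lambda t: x0 + t * (x1 - x0)            # constant speed
accelerating = lambda t: x0 + t ** 2 * (x1 - x0)   # same endpoints, varying speed

action_straight = action(straight)      # close to |x1 - x0|^2 = 9
action_accel = action(accelerating)     # close to (4/3) |x1 - x0|^2 = 12
d2_squared = (x1 - x0) ** 2
```

The constant-speed curve attains the infimum, while any other parametrisation joining the same endpoints pays a strictly larger action.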
Connection between the continuity equation and curves
The space \(AC^p([0,1]; M)\) is the space of functions \(\gamma:[0,1] \to M\) such that
there exists \(g \in L^p (0,1)\) such that \[ d(\gamma_y,\gamma_x) \le \int_x^y g(t) \diff t, \qquad \forall 0 \le x \le y \le 1. \]
Theorem Let \(\mu_t \in AC^2([0,1]; \mathcal P_2(\Rd))\). Then, there exists a velocity field \(v_t\) such that \(\mu\) is a solution of \[ \frac{\partial \mu_t}{\partial t} + \diver(v_t \mu_t) = 0. \]
Gradient flows in metric spaces
Gradient flows in \(\Rd\)
Let \(F: \Rd \to \mathbb R\). The gradient flow is \[ \frac{\diff u}{\diff t} = - \nabla F (u) \]
If \(D^2 F \ge \lambda I\) then \[|u(t) - \overline u(t) |\le e^{-\lambda t} |u(0) - \overline u(0)|.\]
If \(F\) is strictly convex and attains its minimum, then for any \(u(0)\) we have \[u(t) \to u_\infty = \argmin F.\]
Gradient flows in Hilbert spaces
Let \(H\) be a Hilbert space and \(\mathcal F: \mathrm{Dom}(\mathcal F) \subset H \to \mathbb R\) be Gateaux differentiable.
The gradient flow is the dynamical system given by
\[ \frac{\diff u}{\diff t} = - \mathcal F'(u) \]
If \(\mathcal F\) is strictly convex, there is at most one minimiser, and there are properties similar to the \(\Rd\) case.
Example. (Heat equation) \(\frac{\partial u}{\partial t} = \Delta u\)
corresponds to \(H = L^2(\Rd)\) and \(\mathcal F(u) = \frac 1 2 \int_\Rd |\nabla u|^2\).
Remark. If \(\mathcal F\) is convex, the implicit Euler scheme is convergent (Crandall & Liggett, 1971)
\[ \frac{U_{n+1}^{(\tau)} - U_n^{(\tau)}}{\tau} = - \mathcal F'(U_{n+1}^{(\tau)}) \]
and each step can be recovered from \(\displaystyle U_{n+1}^{(\tau)} \in \argmin_{x \in H} \left(\frac{\| x - U_n^{(\tau)} \|_H^2 }{2\tau} + \mathcal F(x) \right)\)
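The equivalence between the implicit Euler step and the minimisation problem can be observed numerically. This is a minimal sketch, assuming \(H = \mathbb R\) and the hypothetical energy \(\mathcal F(x) = \lambda x^2/2\), for which the implicit Euler step has the closed form \(U_{n+1} = U_n/(1+\tau\lambda)\). The ternary search used as minimiser is an illustrative choice and only works for convex functions of one variable.

```python
def argmin_convex(phi, lo, hi, iters=200):
    """Ternary search for the minimiser of a strictly convex phi on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if phi(m1) < phi(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

def minimising_step(F, u, tau, radius=10.0):
    """One step in minimisation form: argmin of |x - u|^2 / (2 tau) + F(x)."""
    phi = lambda x: (x - u) ** 2 / (2 * tau) + F(x)
    return argmin_convex(phi, u - radius, u + radius)

# Assumed test energy F(x) = lam * x^2 / 2 and step size tau.
lam, tau = 2.0, 0.1
F = lambda x: 0.5 * lam * x * x

u_min = u_euler = 1.0
max_gap = 0.0
for n in range(10):
    u_min = minimising_step(F, u_min, tau)
    u_euler = u_euler / (1 + tau * lam)  # solves (U_{n+1} - U_n) / tau = -lam * U_{n+1}
    max_gap = max(max_gap, abs(u_min - u_euler))
```

The minimisation form never uses \(\mathcal F'\), only the distance and the values of \(\mathcal F\), which is why it makes sense in a metric space.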
Otto’s calculus (Otto, 1996), (Otto, 2001). Formal tangent bundle
We can think formally about the tangent space \(T_{\rho_0} \mathcal P_2 (\Rd)\).
A generic element of \(T_{\rho_0} \mathcal P_2 (\Rd)\) is the derivative of a curve passing through \(\rho_0\).
A generic curve \(t \in [0,1 ] \to \rho_t\) is a solution of the continuity equation in distributional sense \[ \frac{\partial \rho_t}{\partial t} = - \diver( \rho_t \nabla \psi_t ) \]
Formally, the derivative at \(0\) is \[ s = \frac{\partial}{\partial t} \Bigg|_{ t=0 } \rho_t = - \diver( \rho_0 \nabla \psi_0 ) \]
We can map \(s \in T_{\rho_0} \mathcal P_2 (\Rd)\) to some gradient field \(\nabla \psi_0\).
We can even define the metric tensor
\[ \langle s , \overline s \rangle_{\rho_0} = \int_\Rd \langle \nabla \psi_0, \nabla \overline \psi_0 \rangle \diff \rho_0, \qquad \text{where } \overline s = - \diver( \rho_0 \nabla \overline \psi_0 ). \]
Formal gradient in \(\mathcal P_2\).
Let us do first the example \(\displaystyle \mathcal F[\rho \diff x] = \int_\Rd U(\rho) \diff x.\)
Take \(\xi = \nabla \zeta\) with \(\zeta \in C_c^\infty (\Rd)\). Let \(\rho_\ee := (\mathrm{id} + \ee \xi )_\# \rho_0 .\)
The map \((x, \ee) \mapsto \rho_\ee (x)\) is \(C^2\) and \(\displaystyle \qquad \lim_{\ee \to 0} \rho_\ee = \rho_0, \qquad \frac{\partial }{\partial \ee }\Big|_{\ee = 0} \rho_\ee = -\diver( \rho_0 \xi ).\)
\[ \lim_{\ee \to 0} \frac{\mathcal F [\rho_\ee] - \mathcal F[\rho_0]}{\ee} = - \int_\Rd U'(\rho_0) \diver (\rho_0 \xi) \diff x = \int_\Rd \nabla \zeta \cdot \nabla U'(\rho_0) \diff \rho_0 \]
In distributional sense, we have that
\[ \nabla_{d_2} \left( \int_\Rd U(\rho) \right) = - \diver \left( \rho \nabla U'(\rho) \right) . \]
Similarly, using variation formulae (see (Giaquinta & Hildebrandt, 1996)), for general \(\mathcal F\) we get \[ \lim_{\ee \to 0} \frac{\mathcal F [\rho_\ee] - \mathcal F[\rho_0]}{\ee} = \int_\Rd \nabla \zeta \cdot \nabla \frac{\delta \mathcal F}{\delta \rho} [\rho_0] \diff \rho_0 \]
In distributional sense, we have that
\[ \nabla_{d_2} \mathcal F = - \diver \left( \rho \nabla \frac{\delta \mathcal F}{\delta \rho} \right) . \]
More details: (Ambrosio, Gigli & Savare, 2005:sec.10.4.1), (Ambrosio, Brué & Semola, 2021, Lecture 18)
Extension to \(\nabla_{d_p}\) is also available (Otto, 1996), and yields \(p\)-Laplacian flavoured variants.
Formal gradient flows in \(\mathcal P_2\)
Since \(\nabla_{d_2} \mathcal F = - \diver ( \rho \nabla \frac{\delta \mathcal F}{\delta \rho} )\) we have that
\[ \frac{\partial \rho_t}{\partial t} = \diver \left( \rho \nabla \frac{\delta \mathcal F}{\delta \rho} \right) \] is the \(2\)-Wasserstein gradient flow of the energy \(\mathcal F\).
Example. (The Heat Equation) \(\displaystyle \frac{\partial \rho}{\partial t} = \Delta \rho\)
is the formal \(\mathcal P_2\) gradient flow of the Boltzmann entropy
\[\displaystyle \mathcal F[\rho] = \int_\Rd \rho \log \rho.\]
Example. (The Porous Medium Equation) \(\displaystyle \frac{\partial \rho}{\partial t} = \Delta \rho^m\) corresponds to \(\mathcal F [\rho] = \int_{\Rd} \frac 1 {m-1} \rho^m \diff x\).
Example. (Aggregation Equation) \(\frac{\partial \mu_t}{\partial t} = \diver (\mu_t \nabla W * \mu_t)\) corresponds to \(\mathcal F [\mu] = \tfrac 1 2 \int_\Rd (W * \mu) \diff \mu.\)
For existence and uniqueness of the “correct type” of solutions see (Carrillo et al., 2011).
Example. (Aggregation-Diffusion Equation)
\[ \frac{\partial \rho}{\partial t} = \diver\left( \rho \nabla \left( U'(\rho) + V + W*\rho \right) \right) \]
is formally the \(\mathcal P_2\)-gradient flow of the free energy
\[ \mathcal F[\rho] = \int_\Rd \Big( U(\rho) + V\rho + \frac 1 2 \rho (\rho*W) \Big) \diff x. \]
There is a correct notion of “convexity” that reproduces the usual properties.
This problem allows for nice relative entropy arguments.
And this is a beautiful story for another day.
Talk on November 25, 2022 at the UAM seminar.
Rigorous gradient flows in metric spaces: curves of maximal slope
When \(X\) is a Banach space, \(\displaystyle \frac{\partial \rho}{\partial t} = - \nabla_{X} \mathcal F [\rho(t)]\) in \(X^*\).
The main idea is the equivalence for \(u : [0,T] \to \Rd\) that \[ u'(t) = - \nabla \mathcal F (u), \qquad \iff \qquad \begin{cases} \dfrac{\diff }{\diff t} (\mathcal F \circ u ) = -| \nabla \mathcal F (u) | |u'| & \text{orientation} \\ |u'| = |\nabla \mathcal F (u)| & \text{norm} \end{cases} \]
We define the metric derivative and the metric slope \[ | \mu' | (t)= \limsup_{h \to 0} \frac{ d(\mu(t+h) , \mu(t)) }{|h|}, \qquad | \partial \mathcal F | [\mu] = \limsup_{\nu \to \mu} \frac{ (\mathcal F [\mu] - \mathcal F[\nu])_+ }{d (\mu, \nu)}\]
Definition (curve of maximal slope)
A locally abs. cont. curve \(t \mapsto \mu (t) \in M\) such that \(t \mapsto \mathcal F[\mu(t)]\) is abs. cont. and \[ \frac 1 2 \int_s^t |\mu'|^2(r) \diff r + \frac 1 2 \int_s^t |\partial \mathcal F|^2 [\mu(r)] \diff r \le \mathcal F [\mu(s)] - \mathcal F [\mu(t)] \qquad \forall 0 \le s < t \le T \]
“Implicit Euler” in metric spaces. The JKO scheme
Let \((M,d)\) be a metric space and \(\mathcal F: M \to \mathbb R\).
The JKO scheme (Jordan, Kinderlehrer & Otto, 1998) extends the implicit Euler scheme in the minimisation form
\[\begin{equation} \tag{JKO} U_{n+1}^{(\tau)} \in \argmin_{x \in M} \left(\frac{d(x,U_n^{(\tau)})^2 }{2\tau} + \mathcal F(x) \right). \end{equation}\]
To be more rigorous, define \[ u_t^{(\tau)} = U_{n}^{(\tau)} \quad \text{for } n \tau \le t < (n+1) \tau. \]
(Ambrosio, Gigli & Savare, 2005) explains how JKO converges to curves of maximal slope in \(\mathcal P_2\).
Numerics (Carrillo et al., 2019)
And this conversation is left for another day.
A mention on the Fields Medals
Cédric Villani won the Fields Medal in 2010 (see (Yau, 2011))
the Boltzmann and Landau equations:
relative entropy arguments
Ricci flow as a gradient flow:
(Otto & Villani, 2000) and (Lott & Villani, 2009).
Alessio Figalli won the Fields Medal in 2018.
Thesis 2007: Optimal transportation and action-minimizing measures
He used this to study the Monge-Ampère equation.
Hardy-Littlewood-Sobolev inequalities
obstacle problem (with L. Caffarelli)
fractional Laplacian
Beyond characteristics and gradient flows: the duality approach
Multi-species problems
Consider multiple interacting species indexed by \(k = 1, \cdots ,K\),
each species with several particles indexed by \(i\).
Between each two species \(k, \ell\) there is an interaction potential \(W_{k,\ell}\).
Then \[ \frac{\diff}{\diff t} x^{(k,i)}_t = - \sum_\ell \sum_{j} a_{j,k} \nabla W_{k,\ell} (x^{(k,i)}_t - x^{(\ell,j)}_t). \]
This gives a system of PDEs for the empirical measures \[ \left\{ \sum_{i=1}^N a_i \delta_{x_i} : N \in \mathbb N , a_i \ge 0, x_i \in \Rd, \sum_i a_i = 1 \right\} \]
These are not gradient flows, except in particular cases (Di Francesco, Esposito & Fagioli, 2018)
Vanishing viscosity approximation
If we introduce Brownian noise to the particles, then it is natural to study \[ \frac{\partial}{\partial t} \mu^{(k)}_t = \ee \Delta \mu^{(k)} + \diver\left(\mu^{(k)} \nabla \sum_{\ell} W_{k,\ell} * \mu^{(\ell)} \right). \]
This is also the vanishing viscosity approximation.
It does not admit characteristics.
In (Carrillo & G-C, 2022) we study well-posedness by duality
(no characteristics and no gradient flow structure)
Duality approach to \(\mathcal P_1\)
The duality characterisation (Villani, 2003, Theorem 1.14)
\[ d_1 (\mu, \nu) = \sup \left\{ \int_\Rd \psi \diff (\mu - \nu) : \Lip(\psi) \le 1 \right\}. \]
Duality approach to PDEs
We will look at the problems, for \(\ee \ge 0\), \[ \partial_t \mu_t + \diver (\mu_t v_t(x)) = \ee \Delta \mu_t. \]
The distributional formulation is \[ \int_\Rd \varphi_T \diff \mu_T - \int_0^T \int_\Rd \left( \frac{\partial \varphi_t}{\partial t} + v_t(x) \cdot \nabla \varphi_t + \ee \Delta \varphi_t \right) \diff \mu_t \diff t = \int_\Rd \varphi_0 \diff \mu_0. \]
The central term can be cancelled if we take \(\varphi_t = \psi_{T-t}\) given by the adjoint problem \[ \frac{\partial \psi_s}{\partial s} = \nabla \psi_s \cdot v_{T-s}(x) + \ee \Delta \psi_s \]
We get \[ \int_\Rd \psi_0 \diff \mu_T = \int_\Rd \psi_T \diff \mu_0. \]
Duality approach to PDEs. (Carrillo & G-C, 2022)
We will look at the problems, for \(\ee \ge 0\), \[ \partial_t \mu_t + \diver (\mu_t v_t(x)) = \ee \Delta \mu_t, \qquad \qquad \partial_t \overline \mu_t + \diver (\overline \mu_t \overline v_t(x)) = \ee \Delta \overline \mu_t. \]
We take the test functions \[ \partial_s \psi_s = \nabla \psi_s \cdot v_{T-s}(x) + \ee \Delta \psi_s \qquad \qquad \partial_s \overline \psi_s = \nabla \overline \psi_s \cdot \overline v_{T-s}(x) + \ee \Delta \overline \psi_s \] with \(\psi_0 = \overline \psi_0\).
Then \(\displaystyle \int_\Rd \psi_0 \diff (\mu_T - \overline \mu_T) = \int_\Rd \psi_T \diff \mu_0 - \int_\Rd \overline \psi_T \diff \overline \mu_0\)
Or even better \(\displaystyle \int_\Rd \psi_0 \diff (\mu_T - \overline \mu_T) = \int_\Rd \psi_T \diff (\mu_0 - \overline \mu_0) + \int_\Rd (\psi_T - \overline \psi_T) \diff \overline \mu_0\)
Thus
\[ \begin{aligned} d_1(\mu_T, \overline \mu_T) & \le \underbrace{ d_1 (\mu_0, \overline \mu_0) \sup_{\Lip(\psi_0) \le 1} \Lip(\psi_T) }_{\textrm{Cont. dependence on }\mu_0} + \underbrace{ \left( 1 + \int_\Rd |x| \diff \overline \mu_0 \right) \sup_{\Lip(\psi_0) \le 1} \left \| \frac{\psi_T - \overline \psi_T}{1+|x|} \right \|_{L^\infty} }_{\textrm{Cont. dependence on } v} . \end{aligned} \]
Dual-viscosity formulation (Carrillo & G-C, 2022)
Let us go back to the system \[ \frac{\partial}{\partial t} \mu^{(k)}_t = \ee \Delta \mu^{(k)} + \diver\left(\mu^{(k)} \nabla \sum_{\ell} W_{k,\ell} * \mu^{(\ell)} \right) \]
where, for simplicity in this presentation, we take \(W_{k,\ell}\) with \(\nabla W_{k,\ell}\) Lipschitz.
Definition. (dual-viscosity solution) \(\mu^{(k)} \in C( [0,\infty), \mathcal P_1(\Rd) )\) such that,
for each \(T\ge 0\), each \(k\), and each Lipschitz \(\psi_0\), if \(\psi^{(k),T}\) is the unique viscosity solution of \[ \frac{\partial}{\partial t} \psi = \ee \Delta \psi + \nabla \psi \cdot \nabla \left( \sum_{\ell} W_{k,\ell} * \mu^{(\ell)} \right) \]
then \(\displaystyle \int_\Rd \psi_0 \diff \mu_T = \int_\Rd \psi_T \diff \mu_0.\)
Theorem. For all \(\varepsilon \ge 0\), the problem is well-posed in the class of dual-viscosity solutions.
Our framework covers more general settings: \(\nabla \sum_{\ell} W_{k,\ell} * \mu^{(\ell)}\) replaced by a general \(v_t[\mu]\).
Proof: A priori estimates for \(\psi\) and fixed point argument. \(\square\)
Another beautiful story, left for another day.
Conclusion
- The conservation laws \[\begin{equation} \tag{C} \frac{\partial \rho_t}{\partial t} + \diver (\rho_t v_t) = 0 \end{equation}\] are solved as the push-forward through characteristics
- Wasserstein spaces are natural to study (C)
- In fact, (C) characterises \(\mathcal P_2\), by Benamou-Brenier
- Many other conservation laws are gradient flows in Wasserstein space
- (Carrillo & G-C, 2022) introduces a new approach beyond push-forward and gradient flows