Modeling Linear PDE Systems with Probabilistic Machine Learning
Markus Lange-Hegermann
Institut für industrielle Informationstechnik - inIT
Department of Electrical Engineering and Computer Science, TH OWL
Bogdan Raiță
Department of Mathematics and Statistics
Georgetown University
With many, many contributions by Andreas Besginow, Marc Härkönen, Jianlei Huang, Xin Li, and Daniel Robertz
Differential Equations and Data: Why?
Combine expert knowledge and data.
Want interpretable and reliable models.
Focus on (irregularly spaced) data rather than IC/BC.
Numerical simulations might be computationally too costly.
Example
Damped spring-mass system
\[
\frac{\mathrm{d}^2y}{\mathrm{d}t^2} + 2 \frac{\mathrm{d}y}{\mathrm{d}t} + 10y = 0
\]
Sample 5 noisy points
Example
Neural Network
⚠️ ~300,000 parameters
✅ Fast convergence
❌ Bad inter/extrapolation
Example
Physics Informed Neural Network
✅ Good interpolation
✅ Fast inference
⚠️ ~300,000 parameters
⚠️ Requires extra points
⚠️ Not an exact solution
❌ Slow convergence
❌ Hard to optimize
A more constrained approach

How to solve?
\[
\frac{\mathrm{d}^2y}{\mathrm{d}t^2} + 2 \frac{\mathrm{d}y}{\mathrm{d}t} + 10y = 0
\]

Algebraic preprocessing: characteristic frequencies
\[
z^2 + 2z + 10 = 0 \Rightarrow z = -1 \pm 3\sqrt{-1}
\]

Solution Space
\[
y(t) = c_1 \cdot e^{-t} \cos 3t + c_2 \cdot e^{-t} \sin 3t
\]

Gaussian process prior
\[
Y(t) \sim \left( C_1 \cdot e^{-t} \cos 3t + C_2 \cdot e^{-t} \sin 3t \right) + \epsilon
\]
where
\[
C_1 \sim \mathcal{N}(0, \sigma_1), \quad C_2 \sim \mathcal{N}(0, \sigma_2), \quad \epsilon \sim \mathcal{N}(0, \sigma_0)
\]
Algebra: determine suitable frequencies
Stochastic: weigh frequencies
Gaussian Process
(or ridge regression)
✅ 3 parameters
✅ Fast convergence
✅ Good inter/extrapolation
✅ Exact solution
✅ No extra points
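Conditioning this prior on the five noisy points is ordinary conjugate Gaussian (ridge-regression-style) inference in the two basis functions dictated by the ODE. The following is a minimal numpy sketch of that computation; the true coefficients, prior variances, and noise level are illustrative assumptions, not values from the example above.

```python
import numpy as np

# Minimal sketch (illustrative, not reference code): the GP prior above is a
# Bayesian linear model in the two ODE-derived basis functions, so posterior
# mean and variance follow from standard conjugate Gaussian formulas.
rng = np.random.default_rng(0)

def phi(t):
    """Basis spanning the solution space of y'' + 2y' + 10y = 0."""
    return np.stack([np.exp(-t) * np.cos(3 * t), np.exp(-t) * np.sin(3 * t)], axis=-1)

# five noisy observations of an assumed true solution with c1 = 1.0, c2 = -0.5
t_obs = rng.uniform(0.0, 2.0, size=5)
sigma0 = 0.05                                      # observation noise std (assumption)
y_obs = phi(t_obs) @ np.array([1.0, -0.5]) + sigma0 * rng.normal(size=5)

# prior C ~ N(0, diag(sigma1^2, sigma2^2)); the posterior over C is Gaussian
Sigma_prior = np.diag([1.0, 1.0])
Phi = phi(t_obs)
Sigma_post = np.linalg.inv(Phi.T @ Phi / sigma0**2 + np.linalg.inv(Sigma_prior))
mu_post = Sigma_post @ Phi.T @ y_obs / sigma0**2

# posterior mean and pointwise std on a grid; every mean prediction is an exact
# ODE solution because it is a linear combination of the basis functions
t_grid = np.linspace(0.0, 4.0, 200)
Phi_grid = phi(t_grid)
mean = Phi_grid @ mu_post
std = np.sqrt(np.einsum("ij,jk,ik->i", Phi_grid, Sigma_post, Phi_grid))
print(mu_post, float(mean[0]), float(std[0]))
```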
Overview
(Of course, there are many PINN approaches, numerical solvers, and alternatives like neural operators.)
| | EPGP (ours) | S-EPGP (ours) | Vanilla PINN | Vanilla Num. Solver |
| --- | --- | --- | --- | --- |
| Differential equations | linear, c.c. | linear, c.c. | any | well-posed |
| Additional information | data | data | data | IC/BC |
| Evaluation time | $\approx$ data | $\approx$ sparsity | neural net | mediocre |
| Solutions? | yes | yes | approx. near data | yes |
| Prerequisites | integral | - | - | - |
| Sampling | yes | yes | no | not applicable |
Another ODE example

How to solve?
\[
\frac{\mathrm{d}^4y}{\mathrm{d}t^4} + 2 \frac{\mathrm{d}^2y}{\mathrm{d}t^2} + y = 0
\]

Algebraic preprocessing: characteristic frequencies
\[
z^4 + 2z^2 + 1 = 0 \Rightarrow z = \pm i \text{ with multiplicity } 2
\]

Solution Space
\[
y(t) = c_1\, t e^{-it} + c_2\, e^{-it} + c_3\, t e^{it} + c_4\, e^{it}
\]

Gaussian process prior
\[
Y(t) \sim C_1\, t e^{-it} + C_2\, e^{-it} + C_3\, t e^{it} + C_4\, e^{it} + \epsilon
\]
where
\[
C_j \sim \mathcal{N}(0, \sigma_j), \quad \epsilon \sim \mathcal{N}(0, \sigma_0)
\]
Principle for ODEs

How to solve?
\[
\sum_{j=0}^k \alpha_j \frac{\mathrm{d}^j y}{\mathrm{d}t^j} = 0
\]

Algebraic preprocessing: characteristic frequencies
\[
\sum_{j=0}^k \alpha_j z^j = 0 \Rightarrow z \in V = \text{complex roots with multiplicities}
\]

Solution Space
\[
y(t) = \sum_{z \in V}\ \sum_{\text{multipliers for } z} c_h\, B_h(t)\, e^{z t}
\]

Principle for ODEs:
All solutions are linear combinations of exponential-polynomial solutions.
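In practice the preprocessing amounts to finding the roots of the characteristic polynomial with their multiplicities and emitting one basis function $t^h e^{zt}$ per root and multiplier. Below is a small numpy sketch of that step; the clustering tolerance used to detect repeated numerical roots is an assumption of the sketch.

```python
import numpy as np

# Illustrative sketch: from ODE coefficients alpha_0..alpha_k (lowest degree first),
# compute characteristic roots numerically and assemble the basis t^h * exp(z*t).
def exp_poly_basis(alpha, tol=1e-6):
    roots = np.roots(alpha[::-1])          # numpy.roots wants highest degree first
    clusters = []                          # (root, multiplicity), merged within tol
    for z in roots:
        for idx, (zc, mult) in enumerate(clusters):
            if abs(z - zc) < tol:
                clusters[idx] = (zc, mult + 1)
                break
        else:
            clusters.append((z, 1))
    # one basis function t^h e^{z t} per root z and multiplier 0 <= h < multiplicity
    return [(lambda t, z=z, h=h: t**h * np.exp(z * t))
            for z, mult in clusters for h in range(mult)]

# example: y'''' + 2 y'' + y = 0, i.e. alpha = [1, 0, 2, 0, 1]
basis = exp_poly_basis(np.array([1.0, 0.0, 2.0, 0.0, 1.0]))
t = np.linspace(0.0, 1.0, 5)
print(len(basis))                          # 4 basis functions: e^{±it} and t e^{±it}
print(np.column_stack([b(t) for b in basis]).shape)
```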
Fourier & Heat Equation
\[
\frac{\partial T}{\partial t} - \frac{\partial^2 T}{\partial x^2} = 0
\]

Fourier (1807):
Noticed that $T_z(x,t) = e^{-z^2 t + izx} = e^{-z^2 t}(\cos zx + i \sin zx)$ are solutions.
Noticed linearity: $T^c(x,t) = \sum_{z=-N}^N c_z e^{-z^2 t + izx}$ are solutions.
Postulated density: all solutions $T$ can be approximated with solutions $T^c$.
Solved from the initial condition: if $T(x,0) = g(x)$ and $T \approx T^c$, then
\[
g(x) = \sum_{z \in \mathbb{Z}} c_z e^{izx}.
\]
Invented Fourier series.
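The same recipe is easy to run numerically on a periodic interval: read off the coefficients $c_z$ of the initial condition with an FFT and damp each mode by $e^{-z^2 t}$. The sketch below is illustrative; the grid size and initial profile are assumptions.

```python
import numpy as np

# Illustrative sketch of Fourier's construction on [0, 2*pi) with periodic boundary:
# expand the initial condition g in e^{izx}, then damp each mode by e^{-z^2 t}.
N = 64
x = np.linspace(0.0, 2 * np.pi, N, endpoint=False)
g = np.exp(-4.0 * (x - np.pi) ** 2)          # assumed initial temperature profile

c = np.fft.fft(g) / N                        # Fourier coefficients c_z of g
z = np.fft.fftfreq(N, d=1.0 / N)             # integer frequencies 0, 1, ..., -1

def T(t):
    """Truncated Fourier-series solution T^c(x, t) of T_t = T_xx."""
    modes = c[None, :] * np.exp(-z[None, :] ** 2 * t + 1j * z[None, :] * x[:, None])
    return np.real(modes.sum(axis=1))

print(np.max(np.abs(T(0.0) - g)))            # ~0: the series reproduces g at t = 0
print(T(0.5).max())                          # the heat profile has spread and decayed
```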
Our goals
Use additional knowledge of linear PDE systems with constant coefficients.
Construct a computationally nice probability distribution on the solutions of such PDE systems.
Restricting to this important special case allows more suitable methods.
Introduce two such methods: EPGP and S-EPGP.
Principle for PDEs

All solutions are linear combinations of exponential-polynomial solutions, up to approximation.

Let $A \in R^{\ell' \times \ell}$ for $R = \mathbb{R}[\partial_{x_1}, \ldots, \partial_{x_n}]$.
Want to solve $Af = 0$ for $f = (f_1, f_2, \ldots, f_\ell)$.

Algebraic preprocessing: characteristic frequencies
\[
\ker A(z) \neq \{0\} \Rightarrow z \in V = \text{characteristic variety}
\]

Solution Space is the closure of
\[
f(x) = \sum_{j=1}^M \sum_{h=1}^m c_{j,h}\, B_h(x, z_j)\, e^{z_j \cdot x}
\]
where $z_j \in V$.

This is the FUNDAMENTAL PRINCIPLE of Ehrenpreis-Palamodov ('70).
Fundamental Principle for PDEs

Explain Ehrenpreis-Palamodov in more detail:

Let $A \in R^{\ell' \times \ell}$ for $R = \mathbb{R}[\partial_{x_1}, \ldots, \partial_{x_n}]$ and let $\Omega \subseteq \mathbb{R}^n$ be a convex, compact set.

There exist
characteristic varieties $\{V_1, \dotsc, V_s\}$ and
$\ell \times 1$ polynomial operators $\{B_{i,1}(x,z), \dotsc, B_{i,m_i}(x,z)\}_{i=1,\dotsc,s}$ in $2n$ variables,
such that any smooth solution $f \colon \Omega \to \mathbb{R}^{\ell}$ to the equation $Af = 0$ can be approximated by
\[
f(x) = \sum_{i=1}^s \sum_{j=1}^{m_i} \sum_h c_h^{ij}\, B_{i,j}(x, z_h^{ij})\, e^{\langle x, z_h^{ij} \rangle}
\]
where $z_h^{ij} \in V_i$. This is a NONLINEAR FOURIER SERIES with frequencies NOT from a lattice.

Algebraic precomputations are algorithmic and implemented in Macaulay2.
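As a toy instance of the resulting EPGP prior (an illustrative reimplementation, not the authors' code), consider the 1D heat equation $T_t = T_{xx}$: its characteristic variety is $\{(z_t, z_x) : z_t = z_x^2\}$, and taking $z_x = iw$, $z_t = -w^2$ with a measure on the real parameter $w$ gives exponential solutions $e^{-w^2 t + iwx}$. Averaging products of these features over $w$ yields a covariance whose GP realizations and posterior means solve the PDE; the sketch below approximates that average by Monte Carlo with an assumed Gaussian measure.

```python
import numpy as np

# Illustrative Monte Carlo EPGP-style kernel for T_t = T_xx: each feature
# exp(-w^2 t + i w x) is an exact solution, so the GP built from these features
# has posterior means that solve the heat equation.
rng = np.random.default_rng(1)
M = 2000                                   # number of sampled frequencies (assumption)
w = rng.normal(scale=2.0, size=M)          # assumed Gaussian measure on the variety parameter

def features(xt):
    """Complex features phi_w(x, t) for points xt of shape (n, 2) with columns (x, t)."""
    x, t = xt[:, 0:1], xt[:, 1:2]
    return np.exp(-w[None, :] ** 2 * t + 1j * w[None, :] * x) / np.sqrt(M)

def kernel(a, b):
    """Monte Carlo estimate of k = E_w[phi_w(x,t) * conj(phi_w(x',t'))], real part."""
    return np.real(features(a) @ features(b).conj().T)

# GP regression on a few noisy heat-equation observations (illustrative data)
train = np.array([[0.0, 0.0], [1.0, 0.1], [-1.0, 0.1], [0.5, 0.3]])
y = np.array([1.0, 0.3, 0.3, 0.5])
K = kernel(train, train) + 1e-4 * np.eye(len(train))
alpha = np.linalg.solve(K, y)

test = np.array([[0.0, 0.5], [2.0, 0.5]])
print(kernel(test, train) @ alpha)          # posterior mean at new (x, t) points
```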
S-EPGP

Sparse version of EPGP.

Suitable when the integration required by EPGP is not possible.

Define a GP prior with realizations of the form
\[
f(x) = \sum_{j=1}^m \sum_{i=1}^r C_{i,j}\, B_j(x, z_{i,j})\, e^{\langle x, z_{i,j} \rangle} =: C^T \cdot \phi(x).
\]

Turn $f(x)$ into a GP: $C \sim \mathcal{N}(0, \Sigma)$ for $\Sigma = \operatorname{diag}(\sigma_i^2)$.

We then get a covariance function of the form
\[
k(x, x') = \phi(x)^H\, \Sigma\, \phi(x')
\]
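Because this kernel is a finite feature expansion, training cost scales with the number of features rather than with an integral over the variety, and the frequencies $z_{i,j}$ themselves can be optimized. The sketch below is an illustrative PyTorch reimplementation for the 1D heat equation (number of frequencies, data, learning rate, and noise level are assumptions), learning the frequencies and $\Sigma$ by maximizing the marginal likelihood.

```python
import torch

# Illustrative S-EPGP-style sketch for T_t = T_xx (not the authors' code): each
# feature exp(-w^2 t + i w x) is an exact solution; the frequencies w and the
# diagonal prior Sigma are learned by maximizing the Gaussian marginal likelihood.
torch.manual_seed(0)
r = 16
w = torch.nn.Parameter(torch.randn(r))        # learned frequencies on the variety
log_s = torch.nn.Parameter(torch.zeros(r))    # log sigma_i with Sigma = diag(sigma_i^2)
noise = 1e-3

X = torch.tensor([[0.0, 0.0], [1.0, 0.1], [-1.0, 0.1], [0.5, 0.3]])   # (x, t) data
y = torch.tensor([1.0, 0.3, 0.3, 0.5])

def phi(X):
    """Feature map with one exact heat-equation solution per frequency, shape (n, r)."""
    x, t = X[:, 0:1], X[:, 1:2]
    return torch.exp(-w[None, :] ** 2 * t + 1j * w[None, :] * x)

def gram(A, B):
    """k(A, B) = phi(A) Sigma phi(B)^H, taking the real part."""
    return ((phi(A) * torch.exp(2 * log_s)[None, :]) @ phi(B).conj().T).real

opt = torch.optim.Adam([w, log_s], lr=0.05)
for step in range(200):
    opt.zero_grad()
    K = gram(X, X) + noise * torch.eye(len(y))
    mll = torch.distributions.MultivariateNormal(torch.zeros(len(y)), covariance_matrix=K).log_prob(y)
    (-mll).backward()                          # minimize negative marginal likelihood
    opt.step()

with torch.no_grad():                          # posterior mean at a new (x, t) point
    K = gram(X, X) + noise * torch.eye(len(y))
    X_new = torch.tensor([[0.0, 0.5]])
    alpha = torch.linalg.solve(K, y.unsqueeze(-1)).squeeze(-1)
    print(gram(X_new, X) @ alpha)
```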
Wave equation 2D: Initial Condition
\[
\frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}
\]
Wave equation 2D: Initial Speed
\[
\frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}
\]
Wave equation 2D
\[
\frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}
\]
Numerical solution
Mean of our S-EPGP
PINN
First three frames of the numerical solution are the training data, on a $21 \times 21$ spatial grid.
Heat Equation
Comparison to PINN
Cuomo, Di Cola, Giampaolo, Rozza, Raissi, Piccialli; Scientific machine learning through physics-informed neural networks: Where we are and what’s next. Journal of Scientific Computing; arXiv:2201.05624
Maxwell's equations
Unknown fields $\mathbf{E}(x,y,z,t)$ and $\mathbf{B}(x,y,z,t)$:
\begin{align*}
\nabla \cdot \mathbf{E} &= 0 \\
\nabla \cdot \mathbf{B} &= 0 \\
\nabla \times \mathbf{E} &= -\frac{\partial \mathbf{B}}{\partial t} \\
\nabla \times \mathbf{B} &= \frac{\partial \mathbf{E}}{\partial t}
\end{align*}
Characteristic variety: $V(z_t^2 - z_x^2 - z_y^2 - z_z^2)$, the cone of the 3D wave equation.
Wave equation 2D: Boundary Conditions
\[
\frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}
\]
Dirichlet BC 0 on circle, S-EPGP method
Wave equation 2D: BC, old method
\[
\frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}
\]
Dirichlet BC 0 on rectangle, S-EPGP method
Wave equation 2D: BC, new method
\[
\frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}
\]
Dirichlet BC 0 on rectangle, B-EPGP method
Wave equation 2D
\[
\frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}
\]
Neumann BC 0 on half-space, B-EPGP method
Wave equation 2D
\[
\frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}
\]
Dirichlet BC 0 on triangle, B-EPGP method
Heat equation 2D: IBVP
\[
\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}
\]
No BC, Dirichlet BC 0, Neumann BC 0; B-EPGP method
Conclusion
EPGP: a class of GP priors for solving PDEs
Fully algorithmic
No approximations anywhere $\Rightarrow$ exact solutions
Improved accuracy and convergence speed over PINN
Compatible with standard GP techniques
Interpretable hyperparameters (optimizable)
Only assumptions: linearity and constant coefficients
Very little data necessary
We can say more for parametrizable/controllable systems
Open Problems
Many
Large scale problems and comparison to numerical solvers
Extensions to neural networks
Bringing these methods to industry
Boundary conditions for S-EPGP
Inverse problems
Convergence and density questions
Topologically interesting domains and monodromy phenomena
Thank you for your attention!
Literature
arXiv:2411.16663
arXiv:2212.14319 (ICML 2023)
arXiv:2208.12515 (NeurIPS 2022)
arXiv:2205.03185 (CASC 2022)
arXiv:2002.00818 (AISTATS 2021)
arXiv:1801.09197 (NeurIPS 2018)