In the first blog post of this series, we explored how Physics-Informed Neural Networks (PINNs) combine machine learning and mathematical physics to solve known Partial Differential Equations (PDEs). We focused on how PINNs embed physical laws as constraints into neural networks, providing data-efficient and accurate solutions.
In this second post, we build on that foundation to address an even more ambitious question:
Can PINNs uncover the governing equations of a physical system from sparse and noisy data?
Inspired by the second part of the work by Raissi et al., this post dives into the data-driven discovery of PDEs, where PINNs infer both the solutions and the dynamics of physical systems. Let’s begin by revisiting the PDE framework and contrasting it with the first blog post before advancing into the methodology and examples.
Solving Known PDEs
In the first blog post, we addressed the problem of solving PDEs when their dynamics are explicitly known. The PDE was formulated as:

u_t + N[u] = 0, x ∈ Ω, t ∈ [0, T]
where:
- u(t, x) is the system state (e.g., wave amplitude, velocity, or temperature).
- N[u] is the nonlinear operator (e.g., advection, diffusion, reaction).
The goal was to approximate u(t, x) across the spatio-temporal domain, ensuring consistency with observed data and the physics embedded in N[u].
Discovering Unknown PDEs
Now, we extend the problem to situations where the governing dynamics N[u; λ] are partially or entirely unknown. The problem becomes:

u_t + N[u; λ] = 0, x ∈ Ω, t ∈ [0, T]
where:
- N[u; λ] is parameterized by unknown coefficients or its structure is unknown altogether.
- u(t, x) and λ must be inferred simultaneously from data.
The challenge here is dual:
1. Learn the latent state u(t, x).
2. Discover the unknown parameters λ or structure of N.
Key Difference:
In solving PDEs, we used PINNs to find u(t, x) when N[u] was given. In discovering PDEs, we aim to find both u(t, x) and N[u; λ].
To uncover the dynamics, the PINN is designed to learn both u(t, x) and λ by minimizing a composite loss function.
Defining the Residual
The PDE residual measures how well the neural network approximates the underlying physics at every point in the domain. The residual is central to the PINN framework, as it enforces physical laws during training.
In the case of solving known PDEs, the residual was defined as:

f := u_t + N[u]
where:
- u(t, x): Neural network approximation of the solution.
- N[u]: Known nonlinear operator, such as advection, diffusion, or reaction terms.
The residual measures the violation of the given PDE at collocation points. During training:
- The physics loss minimized f(t, x)² to ensure the solution u(t, x) satisfied the known PDE.
- The neural network learned only u(t, x), while N[u] was fixed and predefined.
The residual here acted as a penalty for deviations from a fixed physical law. The goal was to enforce consistency between the network output and the known dynamics.
For discovering PDEs, the residual becomes:

f := u_t + N[u; λ]
where:
- u(t, x): Neural network approximation of the solution.
- N[u; λ]: Unknown operator.
The key difference is that both u(t, x) and λ are now learnable. The residual measures the consistency of the inferred solution u(t, x) with the discovered dynamics N[u; λ]. During training:
- The physics loss minimizes f(t, x; λ)² to discover the structure or parameters of N.
- The neural network learns both u(t, x) and λ simultaneously.
Here, the residual enforces consistency not only between the network output and the observed data but also between the inferred dynamics N[u; λ] and the underlying physics.
The loss function in the first blog post was:

MSE = MSE_data + MSE_physics
where:
- MSE_data: Measures the error between the network output u(t, x) and observed data.
- MSE_physics: Penalizes violations of the fixed PDE residual f(t, x).
For discovery, the loss remains similar in structure but shifts in focus:
- MSE_data now ensures that u(t, x) aligns with sparse observations.
- MSE_physics penalizes the residual f(t, x; λ) to ensure consistency with the inferred dynamics.
The critical difference is that MSE_physics now indirectly optimizes λ, guiding the discovery of the operator N[u; λ].
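To make this concrete, here is a minimal sketch of how λ can be registered as a trainable parameter alongside the network weights. PyTorch is an assumption here (the original work by Raissi et al. used TensorFlow), and the class name and structure are purely illustrative; the two scalar coefficients anticipate the Burgers’ example below.

```python
import torch
import torch.nn as nn

class DiscoveryPINN(nn.Module):
    """Approximates u(t, x) and carries learnable PDE coefficients."""

    def __init__(self, net):
        super().__init__()
        self.net = net  # any network mapping (t, x) -> u
        # Registered as nn.Parameter, the coefficients are updated by
        # the optimizer together with the network weights.
        self.lambda_1 = nn.Parameter(torch.tensor(0.0))
        self.lambda_2 = nn.Parameter(torch.tensor(0.0))

    def forward(self, t, x):
        return self.net(torch.cat([t, x], dim=1))
```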
Why This Matters
The shift in the role of the residual highlights a broader transition in how PINNs are applied:
1. From Solving: Where the residual is a strict constraint enforcing known physics.
2. To Discovering: Where the residual is a flexible constraint that uncovers both the state and the dynamics of the system.
This flexibility makes PINNs a powerful tool for exploring unknown systems where explicit physical laws are unavailable or only partially known.
Example: Discovering the Burgers’ Equation
The Burgers’ equation is a fundamental nonlinear PDE widely used in physics, engineering, and applied mathematics to model phenomena involving advection and diffusion. It appears in contexts such as:
- Fluid Dynamics: Describing the behavior of viscous flows.
- Traffic Flow: Modeling the dynamics of traffic density.
- Shock Wave Formation: Understanding how waves evolve and interact in nonlinear systems.
Its combination of nonlinear advection and diffusion makes it a challenging equation to solve and an excellent benchmark for testing computational methods.
In this example, we will explore how PINNs can discover the governing dynamics of the Burgers’ equation, learning both the solution u(t, x) and the parameters λ_1 and λ_2 directly from sparse and noisy data.
The Burgers’ Equation
The Burgers’ equation is written as:

u_t + λ_1 u u_x - λ_2 u_xx = 0
where:
- u(t, x): The unknown function representing the state of the system.
- u_t: The time derivative of u.
- u_x: The spatial derivative of u.
- u_xx: The second spatial derivative of u.
- λ_1: Advection coefficient.
- λ_2: Diffusion coefficient.
Advection captures the nonlinear transport of the system state, while diffusion accounts for the smoothing effect of viscosity or dispersion.
The PINN Framework for Discovery
To uncover the dynamics, the residual is defined as:

f := u_t + λ_1 u u_x - λ_2 u_xx
- The neural network approximates u(t, x), while u_t, u_x, and u_xx are computed using automatic differentiation.
- The parameters λ_1 and λ_2 are treated as learnable variables, as shown in the sketch below.
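Continuing the illustrative PyTorch sketch from earlier (again an assumption about tooling), the residual can be assembled with automatic differentiation. The function name and the DiscoveryPINN model it expects are hypothetical, not part of the original implementation:

```python
def burgers_residual(model, t, x):
    """Residual f = u_t + lambda_1 * u * u_x - lambda_2 * u_xx.

    t and x must be tensors with requires_grad=True so that
    torch.autograd.grad can differentiate u with respect to them.
    """
    u = model(t, x)
    ones = torch.ones_like(u)
    u_t = torch.autograd.grad(u, t, grad_outputs=ones, create_graph=True)[0]
    u_x = torch.autograd.grad(u, x, grad_outputs=ones, create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x, x, grad_outputs=torch.ones_like(u_x),
                               create_graph=True)[0]
    return u_t + model.lambda_1 * u * u_x - model.lambda_2 * u_xx
```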
Loss Function
The loss function for discovery includes two terms:
1. Data Loss:
Ensures the predicted solution u(t, x) matches the observed data:

MSE_data = (1/N) Σ_i |u(t_i, x_i) - u_i|²

2. Physics Loss:
Penalizes violations of the residual across collocation points:

MSE_physics = (1/N) Σ_i |f(t_i, x_i; λ)|²
The total loss is the combination of both (see the code sketch below):

MSE = MSE_data + MSE_physics
By minimizing this loss, the PINN simultaneously learns:
- The latent solution u(t, x).
- The parameters λ_1 and λ_2.
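Under the same PyTorch assumptions, the composite loss is a direct translation of the two terms above; t_data, x_data, u_data (observations) and t_col, x_col (collocation points) are hypothetical tensor names:

```python
def pinn_loss(model, t_data, x_data, u_data, t_col, x_col):
    # Data loss: fit the network output to the observed samples of u.
    mse_data = torch.mean((model(t_data, x_data) - u_data) ** 2)
    # Physics loss: penalize the residual at the collocation points;
    # its gradient flows into both the weights and lambda_1, lambda_2.
    mse_physics = torch.mean(burgers_residual(model, t_col, x_col) ** 2)
    return mse_data + mse_physics
```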
Experiment Setup
- Boundary Conditions: No-flux conditions are enforced at the domain boundaries.
- Data Points: A total of 2,000 randomly sampled points in the spatio-temporal domain, including some with noise.
- Collocation Points: Match the number of observed data points and are sampled uniformly throughout the domain (see the sketch below).
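A rough sketch of this sampling step, assuming the canonical benchmark domain x ∈ [-1, 1], t ∈ [0, 1] from Raissi et al.; t_star, x_star, u_star are hypothetical flattened arrays holding a reference solution:

```python
import numpy as np

N = 2000
# Draw N random spatio-temporal observations from the reference solution.
idx = np.random.choice(u_star.size, N, replace=False)

def pick(a):
    return torch.tensor(a[idx], dtype=torch.float32).reshape(-1, 1)

t_data, x_data, u_data = pick(t_star), pick(x_star), pick(u_star)
# Optionally corrupt the observations with 1% Gaussian noise.
u_data = u_data + 0.01 * u_data.std() * torch.randn_like(u_data)

# Collocation points: same count, uniform over the domain; they need
# requires_grad=True so the residual can be differentiated w.r.t. them.
t_col = torch.rand(N, 1, requires_grad=True)                      # t in [0, 1]
x_col = torch.empty(N, 1).uniform_(-1.0, 1.0).requires_grad_(True)  # x in [-1, 1]
```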
Neural Network Architecture
- Structure: A feedforward neural network with 9 hidden layers and 20 neurons per layer.
- Activation Function: tanh, chosen for its smoothness and ability to model nonlinearities.
- Optimizer: L-BFGS, a quasi-Newton optimization method effective for training PINNs (wired up in the sketch below).
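Continuing the same hypothetical PyTorch setup, the architecture and optimizer described above might be wired up as follows (iteration counts and line-search settings are illustrative, not taken from the original work):

```python
# 9 hidden layers of 20 tanh units each, mapping (t, x) -> u.
layers = []
for i in range(9):
    layers += [nn.Linear(2 if i == 0 else 20, 20), nn.Tanh()]
layers.append(nn.Linear(20, 1))
model = DiscoveryPINN(nn.Sequential(*layers))

# L-BFGS jointly optimizes the network weights and lambda_1, lambda_2.
optimizer = torch.optim.LBFGS(model.parameters(), max_iter=50_000,
                              line_search_fn="strong_wolfe")

def closure():
    optimizer.zero_grad()
    loss = pinn_loss(model, t_data, x_data, u_data, t_col, x_col)
    loss.backward()
    return loss

optimizer.step(closure)
print(f"lambda_1 = {model.lambda_1.item():.4f}, "
      f"lambda_2 = {model.lambda_2.item():.6f}")
```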
Results: Parameter Discovery
True Coefficients: λ_1 = 1.0 and λ_2 = 0.01/π ≈ 0.003183 (the benchmark values used by Raissi et al.).
Discovered Coefficients:
- Clean Data: both coefficients recovered with error below 0.1%.
- 1% Noise: error below 0.2%.
These results demonstrate the robustness of PINNs, even in the presence of noisy observations.
Solution Reconstruction
The predicted u(t, x):
- Accurately matches the true solution at all temporal snapshots, including regions with steep gradients.
- Smoothly interpolates between observed data points while adhering to the governing PDE.
Robustness to Noise
The physics loss acts as a regularizer, ensuring the model is robust to noise. For noise levels up to 10%, the parameter error remained below 2%.
Conclusion
In this blog post, we explored the remarkable ability of Physics-Informed Neural Networks (PINNs) to go beyond solving PDEs and tackle the challenging task of discovering unknown dynamics directly from sparse and noisy data. This shift represents a significant leap in the application of machine learning to scientific discovery.
Key Takeaways
1. Unified Framework: PINNs seamlessly integrate observed data and governing physics, enabling the simultaneous learning of solutions and PDE parameters. This unified framework greatly reduces the need for large datasets, making it highly data-efficient.
2. Robustness to Noise: By embedding physics as a constraint, PINNs act as natural regularizers, filtering out inconsistencies in noisy data. This robustness is critical for real-world applications where perfect data is rarely available.
3. Interpretability: Unlike purely data-driven models, PINNs reveal the underlying dynamics, providing insights into the structure of the physical system. This interpretability bridges the gap between black-box machine learning models and traditional scientific approaches.
4. Scalability to Real-World Problems: PINNs’ ability to handle complex nonlinear systems, as demonstrated with Burgers’ equation, positions them as a promising tool for applications in fluid mechanics, traffic modeling, and wave dynamics.
As we continue this series, we will dive into more advanced topics, further demonstrating the versatility and potential of PINNs in bridging the gap between machine learning and scientific discovery.