Note/ Operator Learning

Published: September 06, 2024

From DeepONet to several Neural Operators

DeepONet

The author's presentation

function: $\mathbb R^{d_1} \to \mathbb R^{d_2}$

operator: function to function, $\mathcal G: u \mapsto \mathcal G(u)$, with $\mathcal G(u)(y)\in \mathbb R$; a learned operator generalizes to other input functions u

example: derivative, integral, dynamical system, forcing to solution, …

there’s a universal approximation theorem for OPERATORS (Chen & Chen 1995)

DeepONet: inputs u & y, output $\mathcal G(u)(y)$

pretrain w/ DeepONet, then refine w/ PINNs

Brunton's tutorial

input1: values of the input function $u(x_1), u(x_2), …$ -> BranchNet

input2: locations $t_1, t_2, …$ -> TrunkNet

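The branch and trunk inputs need not have the same dimension, and the sensor locations used for the branch input need not be the query locations fed to the trunk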
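To make the branch/trunk split concrete, here is a minimal NumPy sketch of an untrained DeepONet forward pass. The layer widths, the sensor count `m`, the latent size `p`, and the helpers `mlp_init` / `mlp_apply` are all assumptions for illustration, not taken from the talk or the tutorial. The output is the inner product of the branch and trunk features.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_init(sizes):
    """Random weights for a small fully connected net (illustrative only)."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(m_in), (m_in, m_out)), np.zeros(m_out))
            for m_in, m_out in zip(sizes[:-1], sizes[1:])]

def mlp_apply(params, x):
    """tanh MLP; the last layer is linear."""
    for W, b in params[:-1]:
        x = np.tanh(x @ W + b)
    W, b = params[-1]
    return x @ W + b

m, p = 50, 32                    # m sensors for u, p latent features (assumed)
branch = mlp_init([m, 64, p])    # takes [u(x_1), ..., u(x_m)]
trunk = mlp_init([1, 64, p])     # takes a query location y

def deeponet(u_sensors, y):
    """G(u)(y) ~ sum_k b_k(u) * t_k(y)."""
    b = mlp_apply(branch, u_sensors)   # (batch, p)
    t = mlp_apply(trunk, y)            # (n_query, p)
    return b @ t.T                     # (batch, n_query)

x = np.linspace(0.0, 1.0, m)
u = np.stack([np.sin(2 * np.pi * x), x ** 2])   # two input functions on m sensors
y = np.linspace(0.0, 1.0, 7)[:, None]           # 7 query points, not the sensor grid
out = deeponet(u, y)                            # shape (2, 7)
```

Note that the branch sees 50 sensor values while the trunk sees 7 unrelated query points, which is the dimension mismatch the note above allows.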

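Concretely, consider a one-dimensional nonlinear dynamical system:

$\dfrac{du}{dt}=f(u,t)$

where u(t) is the system state and f(u,t) is a nonlinear function.

The goal is to use DeepONet to learn the solution operator of this system: given the initial condition $u(0)=u_0$ and a time t, predict the state u(t).

DeepONet architecture:

Branch network: input is the initial condition $u_0$
Trunk network: input is the time t
Output: the system state u(t)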
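A sketch of how training triples $(u_0, t, u(t))$ for this setup might be generated. The right-hand side $f(u,t)=u(1-u)$ (logistic growth) is my own choice, picked only because it has a closed-form solution to check against. The RK4 integrator, the sampling range of $u_0$, and the time grid are likewise assumptions.

```python
import numpy as np

def f(u, t):
    """Example nonlinear right-hand side (logistic growth); an assumption."""
    return u * (1.0 - u)

def rk4_solve(u0, t_grid):
    """Integrate du/dt = f(u, t) on t_grid with classical RK4."""
    u = np.empty((len(t_grid),) + np.shape(u0))
    u[0] = u0
    for k in range(len(t_grid) - 1):
        t, h = t_grid[k], t_grid[k + 1] - t_grid[k]
        k1 = f(u[k], t)
        k2 = f(u[k] + 0.5 * h * k1, t + 0.5 * h)
        k3 = f(u[k] + 0.5 * h * k2, t + 0.5 * h)
        k4 = f(u[k] + h * k3, t + h)
        u[k + 1] = u[k] + h / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return u

rng = np.random.default_rng(0)
u0 = rng.uniform(0.1, 0.9, size=100)   # branch inputs: initial conditions
t_grid = np.linspace(0.0, 2.0, 41)     # trunk inputs: query times
U = rk4_solve(u0, t_grid)              # U[k, i] = u(t_k) for initial condition u0[i]

# Flatten into aligned (u0, t, u(t)) training triples.
branch_in = np.repeat(u0[None, :], len(t_grid), axis=0).ravel()
trunk_in = np.repeat(t_grid[:, None], len(u0), axis=1).ravel()
target = U.ravel()
```

Each row of the flattened arrays is one supervised sample: the branch gets `branch_in`, the trunk gets `trunk_in`, and the network is fit to `target`.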

Zongyi's video

Neural Operator

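For example, take the forcing of an elliptic equation as input and output its solution; in practice this problem is still formulated as image-to-image (I2I)

but in essence it does not depend on the mesh/resolution: the higher the input resolution, the higher the resolution of the solution

Intuition: Green’s function. It shows that u can be obtained by a convolution; the Green's function depends on the input and is approximated by a neural network; for harder, more nonlinear problems, this process is implemented with many layers; each layer approximates a Green's function with convolution + bias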
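A small numerical check of the Green's function intuition, on a problem I chose for illustration: $-u''=f$ on $(0,1)$ with $u(0)=u(1)=0$, whose Green's function is $G(x,y)=\min(x,y)\,(1-\max(x,y))$. The grid size and the trapezoidal quadrature are assumptions. In this linear case the kernel is known and fixed; a neural operator would learn it, and would let it depend on the input.

```python
import numpy as np

n = 201
x = np.linspace(0.0, 1.0, n)
h = x[1] - x[0]

# Green's function of -u'' = f on (0, 1) with u(0) = u(1) = 0.
X, Y = np.meshgrid(x, x, indexing="ij")
G = np.minimum(X, Y) * (1.0 - np.maximum(X, Y))

# Trapezoidal quadrature weights.
w = np.full(n, h)
w[[0, -1]] = h / 2

f = np.pi ** 2 * np.sin(np.pi * x)   # forcing
u = G @ (w * f)                      # u(x_i) = sum_j G(x_i, y_j) f(y_j) w_j
u_exact = np.sin(np.pi * x)          # exact solution for this forcing
err = np.max(np.abs(u - u_exact))
```

The solve is a single kernel integral, i.e. one non-local linear layer applied to the forcing.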

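Overall structure: alternate 1. non-local linear layers (convolution) and 2. local nonlinear activation functions; an encoder-decoder can be added on top

Works on arbitrary grids, including irregular ones; a suitable kernel can be constructed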

Graph-based operator GNO

Discretize the integral above

convolution kernel -> integral over the adjacency matrix -> message passing

New nodes can keep being added
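The kernel -> adjacency -> message passing chain can be sketched as one layer on an irregular point cloud. This is a simplified illustration, not the GNO reference implementation: the radius graph, the mean aggregation, the tiny kernel network `kernel`, and all sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def kernel(xi, xj, W1, W2):
    """Hypothetical kernel network kappa(x_i, x_j) -> (c, c) matrix."""
    hidden = np.tanh(np.concatenate([xi, xj]) @ W1)
    c = int(np.sqrt(W2.shape[1]))
    return (hidden @ W2).reshape(c, c)

def gno_layer(pos, v, radius, W1, W2, W):
    """v_i <- relu(W v_i + mean_{j in N(i)} kappa(x_i, x_j) v_j)."""
    out = np.empty_like(v)
    for i in range(len(pos)):
        dist = np.linalg.norm(pos - pos[i], axis=1)
        nbrs = np.flatnonzero(dist <= radius)   # radius graph, includes i itself
        msgs = [kernel(pos[i], pos[j], W1, W2) @ v[j] for j in nbrs]
        # The mean over neighbours is a Monte Carlo estimate of the kernel integral.
        out[i] = np.maximum(0.0, W @ v[i] + np.mean(msgs, axis=0))
    return out

n, d, c = 30, 2, 4                       # nodes, spatial dim, channels (assumed)
pos = rng.uniform(size=(n, d))           # irregular point cloud
v = rng.normal(size=(n, c))              # node features
W1 = rng.normal(size=(2 * d, 16)) * 0.5
W2 = rng.normal(size=(16, c * c)) * 0.5
W = rng.normal(size=(c, c)) * 0.5
v_new = gno_layer(pos, v, 0.3, W1, W2, W)
```

Because the layer only uses node positions and neighbourhoods, it does not care about node ordering or about how many nodes there are.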

Extension: multi-level multi-res graph, perhaps the precursor of GraphCast

FNO

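The Fourier transform is more global than discrete convolution kernels

convolution -> linear transform in the Fourier domain -> $v_{out} = \mathcal F ^{-1} (R(\mathcal F(v_{in})))$

High frequencies are truncated, but thanks to the nonlinearity and the decoder, high-frequency information can still be recovered

Parallel linear layer: carries positional and non-periodic information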
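A minimal 1D sketch of one Fourier layer: FFT, keep the lowest `k_max` modes, multiply each by a complex matrix `R`, inverse FFT, add the parallel pointwise linear path, apply a ReLU. The shapes, the random weights, and the choice of `norm="forward"` (which keeps the Fourier coefficients independent of the grid size) are assumptions, not the reference FNO code.

```python
import numpy as np

rng = np.random.default_rng(0)

def fourier_layer(v, R, W, b):
    """v: (n, c) samples on a uniform periodic grid; returns (n, c)."""
    n, c = v.shape
    k_max = R.shape[0]
    v_hat = np.fft.rfft(v, axis=0, norm="forward")   # (n // 2 + 1, c)
    out_hat = np.zeros_like(v_hat)
    # Linear transform on the lowest k_max modes; higher modes are truncated.
    out_hat[:k_max] = np.einsum("kio,ki->ko", R, v_hat[:k_max])
    spectral = np.fft.irfft(out_hat, n=n, axis=0, norm="forward")
    return np.maximum(0.0, spectral + v @ W + b)     # parallel linear path + ReLU

c, k_max = 2, 8
R = rng.normal(size=(k_max, c, c)) + 1j * rng.normal(size=(k_max, c, c))
W = rng.normal(size=(c, c))
b = np.zeros(c)

def sample(n):
    """The same band-limited function sampled on an n-point periodic grid."""
    x = np.arange(n) / n
    return np.stack([np.sin(2 * np.pi * x), np.cos(4 * np.pi * x)], axis=1)

out64 = fourier_layer(sample(64), R, W, b)
out128 = fourier_layer(sample(128), R, W, b)   # same weights, finer grid
```

The same weights applied on the finer grid agree with the coarse result at the shared grid points, which is the sense in which the layer is independent of resolution.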

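Other applications: because the solver is faster, MCMC can be used to invert the initial condition from the results more quickly

Thus the Fourier layer can be thought of as a substitute for a conv layer

Needs a regular mesh, but is not tied to a particular mesh resolution (mesh-free in that sense), so it suits super-resolution

Extensions: LaplacianNO (different kernel), AFNO (adds a transformer)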

Sparse convolution as NO

sparse tensor & networks

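sparse tensor: most entries are 0, so only the {coordinate, value} pairs of the nonzeros are stored

This reduces storage and compute cost

Used for representing objects, point clouds, and the like.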
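The {coordinate, value} storage can be illustrated with plain NumPy arrays in COO style. This is a sketch of the idea only; real sparse tensor libraries add hashing, batching and feature channels. The helper `to_dense` is hypothetical.

```python
import numpy as np

dense = np.zeros((6, 6))
dense[1, 2] = 3.0
dense[4, 0] = -1.5
dense[5, 5] = 2.0

# COO-style sparse representation: coordinates and values of the nonzeros only.
coords = np.argwhere(dense != 0)    # (nnz, 2) integer coordinates
values = dense[tuple(coords.T)]     # (nnz,) matching values

def to_dense(coords, values, shape):
    """Scatter the stored pairs back into a dense array."""
    out = np.zeros(shape)
    out[tuple(coords.T)] = values
    return out

restored = to_dense(coords, values, dense.shape)
```

Here 3 pairs stand in for 36 dense entries; the saving grows with the emptiness of the grid.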

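sparse-conv: the convolution is computed only at sites that hold values, so the grid stays the same and the valid mask does not grow; an ordinary convolution gradually enlarges the valid grid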
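A toy comparison of the two behaviours, assuming a 3x3 kernel and a dict-of-coordinates sparse format (both my own choices). `submanifold_conv2d` is a hypothetical helper that writes outputs only at the input's active sites; `dense_conv2d` is an ordinary zero-padded convolution for reference.

```python
import numpy as np

def submanifold_conv2d(feats, kernel):
    """feats: dict {(i, j): value}; kernel: (3, 3) array.
    Outputs are computed only at the input's active sites."""
    out = {}
    for (i, j) in feats:
        acc = 0.0
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                nb = (i + di, j + dj)
                if nb in feats:                 # inactive sites contribute 0
                    acc += kernel[di + 1, dj + 1] * feats[nb]
        out[(i, j)] = acc
    return out

def dense_conv2d(x, kernel):
    """Zero-padded 'same' cross-correlation, for comparison."""
    padded = np.pad(x, 1)
    out = np.zeros_like(x)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * kernel)
    return out

feats = {(2, 2): 1.0, (2, 3): 2.0, (5, 5): -1.0}
kernel = np.arange(1.0, 10.0).reshape(3, 3)

sparse_out = submanifold_conv2d(feats, kernel)   # still 3 active sites

x = np.zeros((8, 8))
for (i, j), val in feats.items():
    x[i, j] = val
dense_out = dense_conv2d(x, kernel)              # active set dilates
n_dense_active = np.count_nonzero(dense_out)
```

On the original active sites the two results coincide; the dense version additionally fills in the neighbouring sites, which is the mask growth described above.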

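mesh-free: sparse conv only needs paired positions and values as input, and the positions can be fixed;

so the benefit of sparse convolution as NO is that it can handle irregular structures or grids

dense conv with sparse operators used for resolution changes: for data sampled onto a regular grid, the convolution is still handled with the sparse algorithm, but the result on the current grid points is equivalent to a dense conv (ordinary convolution)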

施文