Denoising Diffusion Probabilistic Models (DDPM)
VFDIFF is built upon the DDPM framework (Ho et al., 2020), which learns to generate data by progressively denoising a random signal. We adapt this for 2D vector fields $\mathbf{x} \in \mathbb{R}^{H \times W \times 2}$.
The process involves two Markov chains: a forward process that adds noise, and a learned reverse process that removes it.
Forward Process (Diffusion)
The forward process $q(\mathbf{x}_t | \mathbf{x}_{t-1})$ gradually adds Gaussian noise to the data according to a variance schedule $\beta_t$:

$$q(\mathbf{x}_t \mid \mathbf{x}_{t-1}) = \mathcal{N}\!\left(\mathbf{x}_t;\ \sqrt{1-\beta_t}\,\mathbf{x}_{t-1},\ \beta_t \mathbf{I}\right)$$
We can sample $\mathbf{x}_t$ directly from the initial data $\mathbf{x}_0$ using the closed-form property:

$$\mathbf{x}_t = \sqrt{\bar{\alpha}_t}\,\mathbf{x}_0 + \sqrt{1-\bar{\alpha}_t}\,\boldsymbol{\epsilon},$$

where $\alpha_t = 1 - \beta_t$, $\bar{\alpha}_t = \prod_{s=1}^t \alpha_s$, and $\boldsymbol{\epsilon} \sim \mathcal{N}(\mathbf{0}, \mathbf{I})$.
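The closed-form sampling step can be sketched in a few lines of NumPy. The linear $\beta_t$ schedule and $T = 1000$ below are illustrative defaults from the DDPM literature, not values specified here:

```python
import numpy as np

# Linear variance schedule (illustrative choice; other schedules are possible)
T = 1000
betas = np.linspace(1e-4, 0.02, T)      # beta_t for t = 1..T
alphas = 1.0 - betas                    # alpha_t = 1 - beta_t
alpha_bars = np.cumprod(alphas)         # alpha_bar_t = prod_{s<=t} alpha_s

def q_sample(x0, t, rng=np.random.default_rng()):
    """Sample x_t ~ q(x_t | x_0) in closed form for a vector field x0 of shape (H, W, 2)."""
    eps = rng.standard_normal(x0.shape)  # epsilon ~ N(0, I)
    x_t = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    return x_t, eps

# Example: noise a random 64x64 2D vector field at timestep t = 500
x0 = np.random.default_rng(0).standard_normal((64, 64, 2))
x_t, eps = q_sample(x0, t=500)
```

Note that `alpha_bars` is monotonically decreasing, so the signal-to-noise ratio shrinks as $t$ grows, as the notation table below describes.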
Reverse Process (Generation)
The generative process $p_\theta(\mathbf{x}_{t-1} | \mathbf{x}_t)$ learns to invert the diffusion. A neural network $\boldsymbol{\epsilon}_\theta(\mathbf{x}_t, t)$ is trained to predict the noise added at each step, yielding the denoising update

$$\mathbf{x}_{t-1} = \frac{1}{\sqrt{\alpha_t}}\left(\mathbf{x}_t - \frac{\beta_t}{\sqrt{1-\bar{\alpha}_t}}\,\boldsymbol{\epsilon}_\theta(\mathbf{x}_t, t)\right) + \sigma_t \mathbf{z}, \qquad \mathbf{z} \sim \mathcal{N}(\mathbf{0}, \mathbf{I}),$$

where $\sigma_t^2 = \beta_t$ is a common choice for the reverse-process variance.
By starting with pure noise $\mathbf{x}_T \sim \mathcal{N}(\mathbf{0}, \mathbf{I})$ and iteratively applying this denoising step, we generate a high-fidelity vector field $\mathbf{x}_0$.
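The sampling loop above can be sketched as follows. Here `eps_theta` is a hypothetical stand-in for the trained network (it returns zeros, so the output is not a meaningful sample), and $\sigma_t^2 = \beta_t$ is assumed:

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)      # illustrative linear schedule
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def eps_theta(x_t, t):
    """Placeholder for the trained noise-prediction network epsilon_theta(x_t, t)."""
    return np.zeros_like(x_t)

def ddpm_sample(shape=(64, 64, 2), rng=np.random.default_rng(0)):
    x = rng.standard_normal(shape)       # x_T ~ N(0, I)
    for t in range(T - 1, -1, -1):
        z = rng.standard_normal(shape) if t > 0 else np.zeros(shape)
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps_theta(x, t)) / np.sqrt(alphas[t])
        x = mean + np.sqrt(betas[t]) * z  # sigma_t^2 = beta_t variance choice
    return x                              # approximate sample of x_0

field = ddpm_sample()
```

In practice `eps_theta` would be a U-Net or similar network conditioned on the timestep embedding.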
Mathematical Notation
| Symbol | Description |
|---|---|
| $\mathbf{x}_0$ | Original clean data (ground truth vector field). |
| $\mathbf{x}_t$ | Noisy data at timestep $t$. |
| $\beta_t$ | Variance schedule parameter (how much new noise is added). |
| $\alpha_t$ | Defined as $1 - \beta_t$. |
| $\bar{\alpha}_t$ | Cumulative product $\prod_{s=1}^t \alpha_s$. Determines the signal-to-noise ratio at timestep $t$. |
| $\boldsymbol{\epsilon}$ | Standard Gaussian noise $\mathcal{N}(\mathbf{0}, \mathbf{I})$. |
| $\boldsymbol{\epsilon}_\theta$ | Neural network predicting the noise component. |
Comparison Baseline: SVD
We compare our deep learning approach against Streamline Vector Diffusion (SVD), a classical numerical method.
SVD minimizes an energy functional $E = \iint \mu |\nabla \mathbf{v}|^2 + \dots$ to propagate flow outward from sparse streamlines. While effective for smooth interpolation, it cannot synthesize the complex non-linear structures that our DDPM learns from data.
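For intuition, the smoothness term of this energy can be minimized with a simple Jacobi relaxation: at convergence, each unconstrained cell equals the average of its neighbors (the discrete Laplace equation). This is a minimal sketch, not the baseline's actual implementation; `svd_relax` is a hypothetical name, and periodic boundaries are assumed for brevity:

```python
import numpy as np

def svd_relax(field, known_mask, n_iters=500):
    """Jacobi relaxation toward the minimizer of the smoothness term
    E = integral of mu * |grad v|^2, holding streamline values fixed.
    field: (H, W, 2) vector field; known_mask: (H, W) bool, True where
    streamline constraints are given."""
    v = field.copy()
    for _ in range(n_iters):
        # Average of the 4 neighbors (periodic boundaries via np.roll)
        avg = (np.roll(v, 1, 0) + np.roll(v, -1, 0) +
               np.roll(v, 1, 1) + np.roll(v, -1, 1)) / 4.0
        # Update free cells only; constrained cells keep their values
        v = np.where(known_mask[..., None], v, avg)
    return v
```

Because the update is purely local averaging, the result is always a smooth blend of the constraints, which is why this class of method cannot produce sharp or novel flow features away from the given streamlines.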