Derive Kalman filter using belief propagation

Here I will derive the Kalman filter from belief propagation (BP). While I won't arrive at exactly the same closed-form expression as the standard Kalman filter, one can verify numerically that the BP procedure below is equivalent to it. For a derivation that reaches the exact same expression, please see here.

The Kalman filter is a special case of Gaussian BP. It can be easily understood in terms of three different message-passing steps, described below.

Scenario 1: Passing message from X to Y for Y=AX

Say X \sim \mathcal{N}(\mu_X, \Sigma_X). Then E[Y] = E[AX] = A\mu_X, and since

\Sigma_Y=E[(Y-\mu_Y)(Y-\mu_Y)^\top]=E[(AX-A\mu_X)(AX - A\mu_X)^\top]

=E[A(X-\mu_X)(X-\mu_X)^\top A^\top]=A\Sigma_X A^\top.

Therefore, Y \sim \mathcal{N}(A \mu_X, A\Sigma_X A^\top).
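As a quick numerical sanity check of Scenario 1 (a sketch using NumPy; the dimensions and parameter values are arbitrary illustrative choices, not from the original post), we can push samples of X through A and compare the empirical moments with the closed-form A\mu_X and A\Sigma_X A^\top:

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary illustrative parameters for X ~ N(mu_x, Sigma_x) and Y = A X
mu_x = np.array([1.0, -2.0])
Sigma_x = np.array([[2.0, 0.3],
                    [0.3, 1.0]])
A = np.array([[1.0, 2.0],
              [0.0, 1.0],
              [3.0, -1.0]])

# Closed form from Scenario 1: Y ~ N(A mu_x, A Sigma_x A^T)
mu_y = A @ mu_x
Sigma_y = A @ Sigma_x @ A.T

# Monte Carlo check against the closed form (generous tolerances)
xs = rng.multivariate_normal(mu_x, Sigma_x, size=200_000)
ys = xs @ A.T
mean_ok = np.allclose(ys.mean(axis=0), mu_y, atol=0.05)
cov_ok = np.allclose(np.cov(ys.T), Sigma_y, atol=0.3)
print(mean_ok, cov_ok)
```

With a fixed seed and a large sample, both checks pass, confirming that a linear map of a Gaussian stays Gaussian with the stated moments.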

Scenario 2: Passing message from Y back to X for Y=AX

This is a bit trickier but still quite simple. Let the covariance of Y be \Sigma_Y. Then, from the perspective of Y,

p(y) = \mathcal{N}(y; Ax, \Sigma_Y)

\sim \exp( -\frac{1}{2} (y-Ax)^\top \Sigma_Y ^{-1} (y-Ax))

\sim \exp\left( -\frac{1}{2} x^\top A^\top \Sigma_Y^{-1} A x + x^\top A^\top \Sigma_Y^{-1} y\right)

\sim \exp\left( -\frac{1}{2} (x-(A^\top \Sigma_Y^{-1} A)^{-1}A^\top \Sigma_Y^{-1} y)^\top (A^\top \Sigma_Y^{-1} A) (x-(A^\top \Sigma_Y^{-1} A)^{-1}A^\top \Sigma_Y^{-1} y)\right)

\sim \mathcal{N}(x;(A^\top \Sigma_Y^{-1} A)^{-1}A^\top \Sigma_Y^{-1} y, (A^\top \Sigma_Y^{-1} A)^{-1})

Thus, X \sim \mathcal{N}((A^\top \Sigma_Y^{-1} A)^{-1}A^\top \Sigma_Y^{-1} y, (A^\top \Sigma_Y^{-1} A)^{-1})
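Scenario 2 can also be checked numerically. The backward message on X is exactly a weighted least-squares estimate of x from y, so we can compare it against a whitened `lstsq` solve (a sketch; A, \Sigma_Y, and y below are arbitrary illustrative values, and A must have full column rank for A^\top \Sigma_Y^{-1} A to be invertible):

```python
import numpy as np

# Backward message for y = A x: given observed y with noise covariance
# Sigma_y, the message on x is Gaussian with
#   precision  Lam = A^T Sigma_y^{-1} A
#   mean       mu  = Lam^{-1} A^T Sigma_y^{-1} y
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
Sigma_y = np.diag([0.5, 0.5, 1.0])
y = np.array([1.0, 2.0, 3.1])

Si = np.linalg.inv(Sigma_y)
Lam = A.T @ Si @ A
mu = np.linalg.solve(Lam, A.T @ Si @ y)

# Cross-check: mu is the weighted least-squares solution of y ~ A x.
# Whiten with W such that Sigma_y^{-1} = W W^T, then solve ordinary lstsq.
W = np.linalg.cholesky(Si)
mu_ls, *_ = np.linalg.lstsq(W.T @ A, W.T @ y, rcond=None)
ok = np.allclose(mu, mu_ls)
print(mu, ok)
```

The two routes agree, which matches the interpretation of the backward message as a (generalized) least-squares fit of x to the observation y.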

Scenario 3: Combining two independent Gaussian messages

Say one message votes for mean \mu_1 and precision \Lambda_1 (the precision is the inverse of the covariance matrix), and another message votes for mean \mu_2 and precision \Lambda_2.

Since

\mathcal{N}(x; \mu_1, \Lambda_1^{-1}) \mathcal{N}(x;\mu_2,\Lambda_2^{-1})

\sim \exp\left(-\frac{1}{2}(x-\mu_1)^\top \Lambda_1(x-\mu_1)-\frac{1}{2}(x-\mu_2)^\top\Lambda_2(x-\mu_2)\right)

\sim \exp\left(-\frac{1}{2}x^\top(\Lambda_1+\Lambda_2)x + x^\top (\Lambda_1 \mu_1 + \Lambda_2\mu_2)\right)

\sim \exp\left(-\frac{1}{2} (x - (\Lambda_1+\Lambda_2)^{-1}(\Lambda_1 \mu_1 + \Lambda_2 \mu_2))^\top (\Lambda_1+\Lambda_2)(x - (\Lambda_1+\Lambda_2)^{-1}(\Lambda_1 \mu_1 + \Lambda_2 \mu_2))\right)

Thus, X \sim \mathcal{N}((\Lambda_1+\Lambda_2)^{-1}(\Lambda_1 \mu_1 + \Lambda_2 \mu_2), (\Lambda_1+\Lambda_2)^{-1})
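Scenario 3 says precisions add and the fused mean is the precision-weighted average of the two means. A small numerical illustration (the means and precisions below are arbitrary illustrative values):

```python
import numpy as np

# Two independent Gaussian messages on the same variable x
mu1, Lam1 = np.array([0.0, 0.0]), np.diag([1.0, 4.0])
mu2, Lam2 = np.array([2.0, 2.0]), np.diag([4.0, 1.0])

# Fusion: precisions add; the mean is the precision-weighted average
Lam = Lam1 + Lam2
mu = np.linalg.solve(Lam, Lam1 @ mu1 + Lam2 @ mu2)
print(mu)
```

Note how the fused mean is pulled toward whichever message is more precise in each dimension: the first coordinate lands near \mu_2 (where message 2 has precision 4 vs. 1) and the second near \mu_1.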

BP for Kalman update

Let’s utilize the above to derive the Kalman update rules. We will use almost exactly the same notation as the Wikipedia article here. We only replace \hat{\bf x}_{k-1|k-1} by \hat{\bf x}_{k-1} and {\bf P}_{k-1|k-1} by {\bf P}_{k-1} to reduce clutter slightly.

  • As shown in the figure above, given {\bf x}_{k-1} \sim \mathcal{N}(\hat{\bf x}_{k-1}, {\bf P}_{k-1}) and \tilde{\bf x}_k={\bf F}_k {\bf x}_{k-1} + {\bf B}_k {\bf u}_k, we have as in (1)

        \[\tilde{\bf x}_k \sim \mathcal{N}({\bf F}_k \hat{\bf x}_{k-1} + {\bf B}_k {\bf u}_k, {\bf F}_k {\bf P}_{k-1}{\bf F}_k^\top)\]

    as it is almost identical to Scenario 1, apart from the trivial constant shift {\bf B}_k {\bf u}_k.

  • Then step (2) is straightforward: \hat{\bf x}_{k|k-1} is nothing but \tilde{\bf x}_k with added noise of covariance {\bf Q}_k, so {\bf P}_{k|k-1}={\bf F}_k {\bf P}_{k-1}{\bf F}_k^\top + {\bf Q}_k.
  • Now, step (3) is just a direct application of Scenario 2.

Finally, we combine the estimates from (2) and (3) as independent information using the result in Scenario 3. This gives us the final a posteriori estimate of {\bf x}_k as

    \[\mathcal{N}\left(\Sigma({\bf P}_{k|k-1}^{-1} \hat{\bf x}_{k|k-1} +{\bf H}_k^\top {\bf R}_k^{-1} {\bf z}_k),\underset{\Sigma}{\underbrace{({\bf P}^{-1}_{k|k-1}+{\bf H}^\top_k {\bf R}^{-1}_k {\bf H}_k)^{-1}}}\right)\]

Although it is not immediately obvious algebraically, one can verify numerically that the \Sigma and the mean above are indeed {\bf P}_{k|k} and \hat{\bf x}_{k|k} from the original formulation, respectively.
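The numerical verification mentioned above can be sketched as follows: run the BP route (Scenarios 1–3) and the textbook gain-based Kalman update side by side on random parameters and compare (all matrices below are randomly generated illustrative values; equivalence follows from the matrix inversion lemma):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 3, 2  # state and measurement dimensions (arbitrary)

def random_spd(k):
    # random symmetric positive-definite matrix
    M = rng.standard_normal((k, k))
    return M @ M.T + k * np.eye(k)

F = rng.standard_normal((n, n))
B = rng.standard_normal((n, 1))
u = rng.standard_normal((1,))
H = rng.standard_normal((m, n))
Q, R = random_spd(n), random_spd(m)
x_prev, P_prev = rng.standard_normal(n), random_spd(n)
z = rng.standard_normal(m)

# --- BP route ---
x_pred = F @ x_prev + B @ u                     # step (1): Scenario 1 + shift
P_pred = F @ P_prev @ F.T + Q                   # step (2): add process noise
Lam_z = H.T @ np.linalg.inv(R) @ H              # step (3): backward message from z
Lam_post = np.linalg.inv(P_pred) + Lam_z        # Scenario 3: precisions add
P_bp = np.linalg.inv(Lam_post)
x_bp = P_bp @ (np.linalg.inv(P_pred) @ x_pred + H.T @ np.linalg.inv(R) @ z)

# --- Standard Kalman update via the gain K ---
S = H @ P_pred @ H.T + R
K = P_pred @ H.T @ np.linalg.inv(S)
x_kf = x_pred + K @ (z - H @ x_pred)
P_kf = (np.eye(n) - K @ H) @ P_pred

ok = np.allclose(x_bp, x_kf) and np.allclose(P_bp, P_kf)
print(ok)  # True
```

The information-form posterior produced by BP and the gain-form posterior of the standard filter agree to numerical precision, for any well-conditioned choice of the parameters.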


Copyright OU-Tulsa Lab of Image and Information Processing 2021