The Kalman Smoother¶

Kalman Equations¶

Recall the basic Kalman filter equations:

\[\begin{split}\begin{align} \hat{\boldsymbol{\xi}}_{t|t} & = \hat{\boldsymbol{\xi}}_{t|t-1} + P_{t|t-1}H(H^{'}P_{t|t-1}H + R)^{-1} \hspace{3pt} (\boldsymbol{Y}_{t}-A^{'}\boldsymbol{x}_{t} - H^{'}\hat{\boldsymbol{\xi}}_{t|t-1}) \\ \hat{\boldsymbol{\xi}}_{t+1|t} & = F \hat{\boldsymbol{\xi}}_{t|t} \\ P_{t|t} & = P_{t|t-1} - P_{t|t-1}H(H^{'}P_{t|t-1}H + R)^{-1}H^{'}P_{t|t-1} \\ P_{t+1|t} & = F P_{t|t} F^{'} + Q. \end{align}\end{split}\]
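As a rough illustration, the four equations can be coded as a single update-and-predict step. This is a minimal NumPy sketch, not a reference implementation: the function name `kalman_step`, the argument layout, and the scalar numbers in the example are assumptions made for illustration, with the state equation taken to be \(\smash{\boldsymbol{\xi}_{t+1} = F\boldsymbol{\xi}_t + \boldsymbol{v}_{t+1}}\) and the observation equation \(\smash{\boldsymbol{Y}_t = A^{'}\boldsymbol{x}_t + H^{'}\boldsymbol{\xi}_t + \boldsymbol{w}_t}\).

```python
import numpy as np

def kalman_step(xi_pred, P_pred, y, x, F, A, H, Q, R):
    """One filter step (hypothetical helper, notation as in the text).

    xi_pred, P_pred : xi_{t|t-1} and P_{t|t-1}
    y, x            : observation Y_t and exogenous regressors x_t
    Returns xi_{t|t}, P_{t|t}, xi_{t+1|t}, P_{t+1|t}.
    """
    S = H.T @ P_pred @ H + R                    # H' P_{t|t-1} H + R
    # Gain P_{t|t-1} H S^{-1}, computed with a solve (S and P are symmetric).
    K = np.linalg.solve(S, H.T @ P_pred).T
    innovation = y - A.T @ x - H.T @ xi_pred    # note the minus signs
    xi_filt = xi_pred + K @ innovation
    P_filt = P_pred - K @ H.T @ P_pred
    xi_next = F @ xi_filt
    P_next = F @ P_filt @ F.T + Q
    return xi_filt, P_filt, xi_next, P_next

# Scalar example with made-up numbers (assumed, not from the text).
F = np.array([[0.5]])
A = np.array([[0.0]])
H = np.array([[1.0]])
Q = np.array([[1.0]])
R = np.array([[1.0]])
xi_filt, P_filt, xi_next, P_next = kalman_step(
    xi_pred=np.array([0.0]), P_pred=np.array([[1.0]]),
    y=np.array([2.0]), x=np.array([1.0]),
    F=F, A=A, H=H, Q=Q, R=R,
)
print(xi_filt, P_filt, xi_next, P_next)
```

In this example the gain is 0.5, so the filtered state is 1.0 with MSE 0.5, and the one-step-ahead prediction is 0.5 with MSE 1.125.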

The State Variable¶

In our development of the Kalman filter we focused attention on forecasting \(\smash{\boldsymbol{\xi}_t}\) or \(\smash{\boldsymbol{Y}_t}\).

However, we might be inherently interested in \(\smash{\boldsymbol{\xi}_t}\) as part of a structural model.
In this case, we consider inference about \(\smash{\boldsymbol{\xi}_t}\) using the full sample of data \(\smash{\boldsymbol{\mathcal{Y}}_{T}}\):

\[\begin{align} \hat{\boldsymbol{\xi}}_{t|T} & = \hat{E}[\boldsymbol{\xi}_{t}|\boldsymbol{\mathcal{Y}}_{T}]. \end{align}\]

We will refer to \(\smash{\hat{\boldsymbol{\xi}}_{t|T}}\) as the smoothed estimate of \(\smash{\boldsymbol{\xi}_t}\), with MSE matrix

\[\begin{align} P_{t|T} & = E[(\boldsymbol{\xi}_{t}-\hat{\boldsymbol{\xi}}_{t|T})(\boldsymbol{\xi}_{t} - \hat{\boldsymbol{\xi}}_{t|T})^{'}]. \end{align}\]

Linear Forecast Update¶

If we observed \(\smash{\boldsymbol{\xi}_{t+1}}\) we could use the linear forecast update equation to obtain

\[\begin{split}\begin{align} \hat{E}[\boldsymbol{\xi}_{t}|\boldsymbol{\xi}_{t+1}, \boldsymbol{\mathcal{Y}}_t] & = \hat{\boldsymbol{\xi}}_{t|t} + E[(\boldsymbol{\xi}_{t} - \hat{\boldsymbol{\xi}}_{t|t}) (\boldsymbol{\xi}_{t+1}-\hat{\boldsymbol{\xi}}_{t+1|t})^{'}] \\ & \hspace{1in} \times E[(\boldsymbol{\xi}_{t+1} - \hat{\boldsymbol{\xi}}_{t+1|t})(\boldsymbol{\xi}_{t+1} - \hat{\boldsymbol{\xi}}_{t+1|t})^{'}]^{-1} \hspace{3pt} (\boldsymbol{\xi}_{t+1} - \hat{\boldsymbol{\xi}}_{t+1|t}). \end{align}\end{split}\]

The first term in the product of the forecast update is

\[\begin{split}\begin{align} E[(\boldsymbol{\xi}_{t}-\hat{\boldsymbol{\xi}}_{t|t})(\boldsymbol{\xi}_{t+1} - \hat{\boldsymbol{\xi}}_{t+1|t})^{'}] & = E[(\boldsymbol{\xi}_{t}-\hat{\boldsymbol{\xi}}_{t|t})(F\boldsymbol{\xi}_{t} + \boldsymbol{v}_{t+1} - F \hat{\boldsymbol{\xi}}_{t|t})^{'}] \\ & = E[(\boldsymbol{\xi}_{t}-\hat{\boldsymbol{\xi}}_{t|t})(\boldsymbol{\xi}_{t} - \hat{\boldsymbol{\xi}}_{t|t})^{'} F^{'}] \\ & = P_{t|t} F^{'}. \end{align}\end{split}\]

We made use of the fact that \(\smash{\boldsymbol{v}_{t+1}}\) is not correlated with \(\smash{\boldsymbol{\xi}_{t}}\) or \(\smash{\hat{\boldsymbol{\xi}}_{t|t}}\).

The second term in the product is \(\smash{P_{t+1|t}}\) by definition. Thus

\[\begin{align} \hat{E}[\boldsymbol{\xi}_{t}|\boldsymbol{\xi}_{t+1},\boldsymbol{\mathcal{Y}}_t] & = \hat{\boldsymbol{\xi}}_{t|t} + P_{t|t} F^{'} P_{t+1|t}^{-1} (\boldsymbol{\xi}_{t+1} - \hat{\boldsymbol{\xi}}_{t+1|t}). \end{align}\]
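The projection above can be sketched numerically as follows. The helper names `smoother_gain` and `project_given_next_state` are hypothetical, and the scalar inputs (including the "observed" value of \(\smash{\boldsymbol{\xi}_{t+1}}\)) are made up for illustration.

```python
import numpy as np

def smoother_gain(P_filt, P_pred_next, F):
    """Return P_{t|t} F' P_{t+1|t}^{-1} without forming the inverse."""
    # Solve P_{t+1|t} X = F P_{t|t}. Because both P matrices are symmetric,
    # X' equals P_{t|t} F' P_{t+1|t}^{-1}.
    return np.linalg.solve(P_pred_next, F @ P_filt).T

def project_given_next_state(xi_filt, xi_pred_next, gain, xi_next):
    """Linear projection of xi_t on (xi_{t+1}, Y_t)."""
    return xi_filt + gain @ (xi_next - xi_pred_next)

# Assumed scalar inputs: P_{t|t} = 0.5, F = 0.5, Q = 1, so P_{t+1|t} = 1.125.
F = np.array([[0.5]])
P_filt = np.array([[0.5]])
P_pred_next = F @ P_filt @ F.T + np.array([[1.0]])
gain = smoother_gain(P_filt, P_pred_next, F)

# Pretend xi_{t+1} = 1.4 were observed (hypothetical value).
estimate = project_given_next_state(
    xi_filt=np.array([1.0]), xi_pred_next=np.array([0.5]),
    gain=gain, xi_next=np.array([1.4]),
)
print(gain, estimate)
```

Here the gain is 2/9, so learning that the state is 0.9 above its forecast raises the estimate of \(\smash{\boldsymbol{\xi}_t}\) from 1.0 to 1.2.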

Linear Forecast Update with All Data¶

For \(\smash{j>0}\) we can iterate on the state equation to express

\[\begin{align} \boldsymbol{Y}_{t+j} & = A^{'} \boldsymbol{x}_{t+j} + H^{'}(F^{j-1} \boldsymbol{\xi}_{t+1} + F^{j-2} \boldsymbol{v}_{t+2} + \ldots + \boldsymbol{v}_{t+j}) + \boldsymbol{w}_{t+j}. \end{align}\]
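The iterated form of the state inside the parentheses can be checked numerically. The sketch below assumes the state equation \(\smash{\boldsymbol{\xi}_{t+1} = F\boldsymbol{\xi}_t + \boldsymbol{v}_{t+1}}\); the matrix `F`, the horizon `j`, and the random draws are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
r, j = 2, 4                                   # state dimension and horizon j > 0
F = np.array([[0.7, 0.1],
              [0.0, 0.5]])
xi_next = rng.normal(size=r)                  # stands in for xi_{t+1}
# Shocks v_{t+2}, ..., v_{t+j}, keyed by their offset s from t.
v = {s: rng.normal(size=r) for s in range(2, j + 1)}

# Recursive form: xi_{t+s} = F xi_{t+s-1} + v_{t+s} for s = 2, ..., j.
xi = xi_next.copy()
for s in range(2, j + 1):
    xi = F @ xi + v[s]

# Closed form: F^{j-1} xi_{t+1} + F^{j-2} v_{t+2} + ... + v_{t+j}.
closed = np.linalg.matrix_power(F, j - 1) @ xi_next
for s in range(2, j + 1):
    closed = closed + np.linalg.matrix_power(F, j - s) @ v[s]

print(np.allclose(xi, closed))
```

Both routes give the same \(\smash{\boldsymbol{\xi}_{t+j}}\), which is what lets \(\smash{\boldsymbol{Y}_{t+j}}\) be written in terms of \(\smash{\boldsymbol{\xi}_{t+1}}\) and later shocks only.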

Note that the forecast error, \(\smash{\boldsymbol{\xi}_t - \hat{E}[\boldsymbol{\xi}_{t}|\boldsymbol{\xi}_{t+1},\boldsymbol{\mathcal{Y}}_t]}\), is

Uncorrelated with \(\smash{\boldsymbol{\xi}_{t+1}}\) (by the definition of linear projection).
Uncorrelated with \(\smash{\boldsymbol{x}_{t+j}, \boldsymbol{w}_{t+j}, \boldsymbol{v}_{t+j}, \boldsymbol{v}_{t+j-1}, \ldots, \boldsymbol{v}_{t+2}}\).

Thus, \(\smash{\boldsymbol{\xi}_t - \hat{E}[\boldsymbol{\xi}_{t}|\boldsymbol{\xi}_{t+1},\boldsymbol{\mathcal{Y}}_t]}\) is uncorrelated with \(\smash{\boldsymbol{Y}_{t+j}}\) for \(\smash{j>0}\) and

\[\begin{split}\begin{align} \hat{E}[\boldsymbol{\xi}_{t}|\boldsymbol{\xi}_{t+1},\boldsymbol{\mathcal{Y}}_T] & = \hat{E}[\boldsymbol{\xi}_{t}|\boldsymbol{\xi}_{t+1},\boldsymbol{\mathcal{Y}}_t] \\ & = \hat{\boldsymbol{\xi}}_{t|t} + P_{t|t} F^{'} P_{t+1|t}^{-1} (\boldsymbol{\xi}_{t+1} - \hat{\boldsymbol{\xi}}_{t+1|t}). \end{align}\end{split}\]

Smoothed Estimate¶

Note that

\(\smash{\hat{\boldsymbol{\xi}}_{t|t}}\) and \(\smash{\hat{\boldsymbol{\xi}}_{t+1|t}}\) are exact linear functions of \(\smash{\boldsymbol{\mathcal{Y}}_t}\), and hence of \(\smash{\boldsymbol{\mathcal{Y}}_T}\).
\(\smash{P_{t|t} F^{'} P_{t+1|t}^{-1}}\) is a function of the fixed parameters only.

Thus, \(\smash{\hat{\boldsymbol{\xi}}_{t|t}}\), \(\smash{\hat{\boldsymbol{\xi}}_{t+1|t}}\), and \(\smash{P_{t|t} F^{'} P_{t+1|t}^{-1}}\) are unchanged when \(\smash{\hat{E}[\boldsymbol{\xi}_{t}|\boldsymbol{\xi}_{t+1},\boldsymbol{\mathcal{Y}}_T]}\) is projected onto \(\smash{\boldsymbol{\mathcal{Y}}_T}\), and the law of iterated projections gives

\[\begin{split}\begin{align} \hat{E}[\boldsymbol{\xi}_{t}|\boldsymbol{\mathcal{Y}}_T] & = \hat{\boldsymbol{\xi}}_{t|t} + P_{t|t} F^{'} P_{t+1|t}^{-1} (\hat{E}[\boldsymbol{\xi}_{t+1}|\boldsymbol{\mathcal{Y}}_T] - \hat{\boldsymbol{\xi}}_{t+1|t}) \\ \Rightarrow \hat{\boldsymbol{\xi}}_{t|T} & = \hat{\boldsymbol{\xi}}_{t|t} + P_{t|t} F^{'} P_{t+1|t}^{-1} (\hat{\boldsymbol{\xi}}_{t+1|T} - \hat{\boldsymbol{\xi}}_{t+1|t}). \end{align}\end{split}\]

A Convenient Fact¶

For any \(\smash{\tau = 1,\ldots,T}\),

\[\begin{split}\begin{align} E[\boldsymbol{\xi}_{t+1} \hat{\boldsymbol{\xi}}_{t+1|\tau}^{'}] & = E[(\boldsymbol{\xi}_{t+1} - \hat{\boldsymbol{\xi}}_{t+1|\tau} + \hat{\boldsymbol{\xi}}_{t+1|\tau})\hat{\boldsymbol{\xi}}_{t+1|\tau}^{'}] \\ & = E[(\boldsymbol{\xi}_{t+1} - \hat{\boldsymbol{\xi}}_{t+1|\tau}) \hat{\boldsymbol{\xi}}_{t+1|\tau}^{'}] + E[\hat{\boldsymbol{\xi}}_{t+1|\tau} \hat{\boldsymbol{\xi}}_{t+1|\tau}^{'}] \\ & = E[\hat{\boldsymbol{\xi}}_{t+1|\tau} \hat{\boldsymbol{\xi}}_{t+1|\tau}^{'}]. \end{align}\end{split}\]

The last equality follows because the projection error \(\smash{\boldsymbol{\xi}_{t+1} - \hat{\boldsymbol{\xi}}_{t+1|\tau}}\) is uncorrelated with \(\smash{\hat{\boldsymbol{\xi}}_{t+1|\tau}}\). It follows that

\[\begin{split}\begin{align} & -E[\hat{\boldsymbol{\xi}}_{t+1|T} \hat{\boldsymbol{\xi}}_{t+1|T}^{'}] + E[\hat{\boldsymbol{\xi}}_{t+1|t} \hat{\boldsymbol{\xi}}_{t+1|t}^{'}] \\ & \hspace{1in} = \left\{E[\boldsymbol{\xi}_{t+1} \boldsymbol{\xi}_{t+1}^{'}] - E[\hat{\boldsymbol{\xi}}_{t+1|T} \hat{\boldsymbol{\xi}}_{t+1|T}^{'}]\right\} - \left\{E[\boldsymbol{\xi}_{t+1} \boldsymbol{\xi}_{t+1}^{'}] - E[\hat{\boldsymbol{\xi}}_{t+1|t} \hat{\boldsymbol{\xi}}_{t+1|t}^{'}]\right\} \\ & \hspace{1in} = E[(\boldsymbol{\xi}_{t+1} - \hat{\boldsymbol{\xi}}_{t+1|T})(\boldsymbol{\xi}_{t+1} - \hat{\boldsymbol{\xi}}_{t+1|T})^{'}] - E[(\boldsymbol{\xi}_{t+1} - \hat{\boldsymbol{\xi}}_{t+1|t})(\boldsymbol{\xi}_{t+1} - \hat{\boldsymbol{\xi}}_{t+1|t})^{'}] \\ & \hspace{1in} = P_{t+1|T} - P_{t+1|t}. \end{align}\end{split}\]

Smoothed MSE¶

For notational convenience, let \(\smash{J_t = P_{t|t} F^{'} P_{t+1|t}^{-1}}\). Using the equation for the smoothed estimate, we obtain

\[\begin{split}\begin{align} \boldsymbol{\xi}_t - \hat{\boldsymbol{\xi}}_{t|T} & = \boldsymbol{\xi}_t - \hat{\boldsymbol{\xi}}_{t|t} - J_t (\hat{\boldsymbol{\xi}}_{t+1|T} - \hat{\boldsymbol{\xi}}_{t+1|t}) \\ \Rightarrow \boldsymbol{\xi}_t - \hat{\boldsymbol{\xi}}_{t|T} + J_t \hat{\boldsymbol{\xi}}_{t+1|T} & = \boldsymbol{\xi}_t - \hat{\boldsymbol{\xi}}_{t|t} + J_t \hat{\boldsymbol{\xi}}_{t+1|t}. \end{align}\end{split}\]

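Multiplying each side of the equation by its transpose and taking expectations, the cross terms vanish: \(\smash{\boldsymbol{\xi}_t - \hat{\boldsymbol{\xi}}_{t|T}}\) is uncorrelated with \(\smash{\hat{\boldsymbol{\xi}}_{t+1|T}}\) (a linear function of \(\smash{\boldsymbol{\mathcal{Y}}_T}\)), and \(\smash{\boldsymbol{\xi}_t - \hat{\boldsymbol{\xi}}_{t|t}}\) is uncorrelated with \(\smash{\hat{\boldsymbol{\xi}}_{t+1|t}}\) (a linear function of \(\smash{\boldsymbol{\mathcal{Y}}_t}\)). Rearranging and applying the convenient fact above: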

\[\begin{align} P_{t|T} & = P_{t|t} + J_t (P_{t+1|T}-P_{t+1|t}) J_t^{'}. \end{align}\]
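A quick numerical sketch of this MSE update follows. The function name `smoothed_mse` and all input values are assumptions chosen for illustration; in particular \(\smash{P_{t+1|T} = 0.6}\) is simply a plausible value below \(\smash{P_{t+1|t}}\).

```python
import numpy as np

def smoothed_mse(P_filt, P_pred_next, P_smooth_next, J):
    """P_{t|T} = P_{t|t} + J_t (P_{t+1|T} - P_{t+1|t}) J_t'."""
    return P_filt + J @ (P_smooth_next - P_pred_next) @ J.T

# Assumed scalar inputs (not from the text).
F = np.array([[0.5]])
P_filt = np.array([[0.5]])            # P_{t|t}
P_pred_next = np.array([[1.125]])     # P_{t+1|t}
P_smooth_next = np.array([[0.6]])     # P_{t+1|T}, hypothetical value

# J_t = P_{t|t} F' P_{t+1|t}^{-1}; an explicit inverse is fine for a 1x1 case.
J = P_filt @ F.T @ np.linalg.inv(P_pred_next)
P_smooth = smoothed_mse(P_filt, P_pred_next, P_smooth_next, J)
print(J, P_smooth)
```

Because \(\smash{P_{t+1|T} - P_{t+1|t}}\) is negative here, the correction term is negative and the smoothed MSE (about 0.474) falls below the filtered MSE of 0.5, reflecting the extra information in later observations.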

The Prescription¶

1. Forward iterate the Kalman filter to obtain \(\smash{\{\hat{\boldsymbol{\xi}}_{t|t}\}_{t=1}^T}\), \(\smash{\{\hat{\boldsymbol{\xi}}_{t+1|t}\}_{t=0}^{T-1}}\), \(\smash{\{P_{t|t}\}_{t=1}^T}\), and \(\smash{\{P_{t+1|t}\}_{t=0}^{T-1}}\).

2. Initialize the backward recursion at \(\smash{t=T}\): the smoothed estimate and its MSE matrix are the final filtered values \(\smash{\hat{\boldsymbol{\xi}}_{T|T}}\) and \(\smash{P_{T|T}}\), respectively.

3. Backward iterate for \(\smash{t=T-1,\ldots,1}\) on the equations

\[\begin{split}\begin{align} \hat{\boldsymbol{\xi}}_{t|T} & = \hat{\boldsymbol{\xi}}_{t|t} + P_{t|t} F^{'} P_{t+1|t}^{-1} (\hat{\boldsymbol{\xi}}_{t+1|T} - \hat{\boldsymbol{\xi}}_{t+1|t}) \\ P_{t|T} & = P_{t|t} + J_t (P_{t+1|T}-P_{t+1|t}) J_t^{'}. \end{align}\end{split}\]
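The whole prescription can be sketched as a forward pass followed by a backward pass. This is a minimal NumPy sketch under stated assumptions, not a definitive implementation: the function name `kalman_smoother`, the array layout (row `i` holds period `i + 1`), the starting values `xi0` and `P0` for \(\smash{\hat{\boldsymbol{\xi}}_{1|0}}\) and \(\smash{P_{1|0}}\), and the data in the example are all hypothetical.

```python
import numpy as np

def kalman_smoother(Y, X, F, A, H, Q, R, xi0, P0):
    """Filter forward, then smooth backward (hypothetical helper).

    Y : (T, n) observations; X : (T, k) exogenous regressors.
    xi0, P0 : assumed starting values xi_{1|0} and P_{1|0}.
    Returns smoothed states (T, r) and their MSE matrices (T, r, r).
    """
    T, r = Y.shape[0], F.shape[0]
    xi_pred = np.zeros((T, r))
    P_pred = np.zeros((T, r, r))
    xi_filt = np.zeros((T, r))
    P_filt = np.zeros((T, r, r))

    # Step 1: forward pass; row i stores xi_{i+1|i} and xi_{i+1|i+1}.
    xi_p, P_p = xi0, P0
    for i in range(T):
        xi_pred[i], P_pred[i] = xi_p, P_p
        S = H.T @ P_p @ H + R
        K = np.linalg.solve(S, H.T @ P_p).T
        xi_filt[i] = xi_p + K @ (Y[i] - A.T @ X[i] - H.T @ xi_p)
        P_filt[i] = P_p - K @ H.T @ P_p
        xi_p = F @ xi_filt[i]
        P_p = F @ P_filt[i] @ F.T + Q

    # Step 2: the last smoothed values equal the last filtered values.
    xi_sm, P_sm = xi_filt.copy(), P_filt.copy()

    # Step 3: backward pass over the remaining periods.
    for i in range(T - 2, -1, -1):
        J = np.linalg.solve(P_pred[i + 1], F @ P_filt[i]).T
        xi_sm[i] = xi_filt[i] + J @ (xi_sm[i + 1] - xi_pred[i + 1])
        P_sm[i] = P_filt[i] + J @ (P_sm[i + 1] - P_pred[i + 1]) @ J.T
    return xi_sm, P_sm

# Scalar AR(1) state observed with noise; all numbers are made up.
F = np.array([[0.8]])
A = np.array([[0.0]])
H = np.array([[1.0]])
Q = np.array([[1.0]])
R = np.array([[0.5]])
Y = np.array([[0.3], [1.1], [0.4], [-0.6]])
X = np.ones((4, 1))
xi_sm, P_sm = kalman_smoother(Y, X, F, A, H, Q, R,
                              xi0=np.zeros(1), P0=np.array([[1.0]]))
print(xi_sm.ravel())
print(P_sm.ravel())
```

Storing every predicted and filtered quantity in the forward pass costs memory proportional to \(\smash{T}\), but it is what makes the backward recursion a single loop with no re-filtering.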