ML Estimation of State-Space Models

Kalman Filter Forecasts

The Kalman filter forecasts \(\smash{\hat{\boldsymbol{\xi}}_{t|t-1}}\) and \(\smash{\hat{\boldsymbol{Y}}_{t|t-1}}\) are linear projections of \(\smash{\boldsymbol{\xi}_{t}}\) and \(\smash{\boldsymbol{Y}_{t}}\) on \(\smash{(\boldsymbol{x}_{t},\boldsymbol{\mathcal{Y}}_{t-1})}\).

  • They are optimal among all forecasts that are linear functions of \(\smash{(\boldsymbol{x}_{t},\boldsymbol{\mathcal{Y}}_{t-1})}\).
  • If \(\smash{\boldsymbol{\xi}_{1}}\) and \(\smash{\{\boldsymbol{w}_{t},\boldsymbol{v}_{t}\}_{t=1}^{T}}\) are multivariate Gaussian, \(\smash{\hat{\boldsymbol{\xi}}_{t|t-1}}\) and \(\smash{\hat{\boldsymbol{Y}}_{t|t-1}}\) are optimal among all forecasts that are functions of \(\smash{(\boldsymbol{x}_{t},\boldsymbol{\mathcal{Y}}_{t-1})}\) (linear and non-linear).

Conditional Distribution of \(\smash{\boldsymbol{Y}_t}\)

The distribution of \(\smash{\boldsymbol{Y}_{t}|\boldsymbol{x}_{t},\boldsymbol{\mathcal{Y}}_{t-1}}\) is also multivariate Gaussian, of the form:

\[\smash{\boldsymbol{Y}_{t}|\boldsymbol{x}_{t},\boldsymbol{\mathcal{Y}}_{t-1} \sim MVN(A^{'}\boldsymbol{x}_{t} + H^{'}\hat{\boldsymbol{\xi}}_{t|t-1},H^{'}P_{t|t-1}H +R)}\]

Thus, the density function is

\[\begin{split}\begin{align} f_{\boldsymbol{Y}_{t}|\boldsymbol{x}_{t}, \boldsymbol{\mathcal{Y}}_{t-1}} & (\boldsymbol{Y}_{t}| \boldsymbol{x}_{t}, \boldsymbol{\mathcal{Y}}_{t-1},\boldsymbol{\theta}) \\ & = (2\pi)^{-n/2} | H^{'}P_{t|t-1}H + R|^{-1/2} \\ & \hspace{1in} \times \exp\bigg\{-\frac{1}{2}\left(\boldsymbol{Y}_{t} - A^{'}\boldsymbol{x}_{t} - H^{'}\hat{\boldsymbol{\xi}}_{t|t-1}\right)^{'} \\ & \hspace{2.5in} \times \left(H^{'}P_{t|t-1}H + R\right)^{-1} \\ & \hspace{3.5in} \times \left(\boldsymbol{Y}_{t} - A^{'}\boldsymbol{x}_{t} - H^{'}\hat{\boldsymbol{\xi}}_{t|t-1}\right)\bigg\} \end{align}\end{split}\]

where \(\smash{\boldsymbol{\theta}}\) aggregates all unknown parameters in \(\smash{F, A, H, Q,}\) and \(\smash{R}\), and \(\smash{n}\) is the dimension of \(\smash{\boldsymbol{Y}_{t}}\).
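
As a minimal sketch of how this density is evaluated, the following Python function computes its logarithm in the scalar case (\(\smash{n = 1}\), one state variable), where transposes and determinants reduce to ordinary products. The function name `predictive_logpdf` and its argument names are illustrative assumptions, not part of the original notes.

```python
import math

def predictive_logpdf(y, x, xi_pred, P_pred, A, H, R):
    """Log density of Y_t given (x_t, Y_{t-1}) in the scalar case (assumed n = 1).

    xi_pred and P_pred stand for xi_{t|t-1} and P_{t|t-1}.
    """
    mean = A * x + H * xi_pred        # A'x_t + H'xi_{t|t-1}
    var = H * P_pred * H + R          # H'P_{t|t-1}H + R
    resid = y - mean                  # one-step-ahead forecast error
    return -0.5 * (math.log(2 * math.pi) + math.log(var) + resid**2 / var)
```

Working on the log scale avoids numerical underflow when many such terms are later combined.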

Log-likelihood

The log-likelihood is the log of the joint density of the sample, which factors into the conditional densities above:

\[\smash{\ell(\boldsymbol{\theta}) = \sum_{t=1}^{T} \log\left(f_{\boldsymbol{Y}_{t}|\boldsymbol{x}_{t}, \boldsymbol{\mathcal{Y}}_{t-1}} (\boldsymbol{Y}_{t}|\boldsymbol{x}_{t}, \boldsymbol{\mathcal{Y}}_{t-1},\boldsymbol{\theta})\right)}\]

  • The log-likelihood can be maximized numerically with respect to \(\smash{\boldsymbol{\theta}}\), which enters through \(\smash{F(\boldsymbol{\theta}), A(\boldsymbol{\theta}), H(\boldsymbol{\theta}), Q(\boldsymbol{\theta})}\), and \(\smash{R(\boldsymbol{\theta})}\).
  • Under the Gaussian assumption, this is the exact log-likelihood, so its maximizer is the exact MLE.
  • Maximum likelihood estimation for \(\smash{MA}\) and \(\smash{ARMA}\) models can be performed in this manner.
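
The sum above can be accumulated inside the Kalman filter loop. The sketch below does this for a scalar model, assuming the standard state equation \(\smash{\xi_{t+1} = F\xi_{t} + v_{t+1}}\) with \(\smash{Var(v_{t}) = Q}\) and the usual prediction recursions for \(\smash{\hat{\xi}_{t|t-1}}\) and \(\smash{P_{t|t-1}}\). The function name `kalman_loglik` and the starting values `xi0` and `P0` (standing for \(\smash{\hat{\xi}_{1|0}}\) and \(\smash{P_{1|0}}\)) are assumptions made for illustration.

```python
import math

def kalman_loglik(y, x, F, A, H, Q, R, xi0, P0):
    """Gaussian log-likelihood of a scalar state-space model (illustrative sketch)."""
    xi_pred, P_pred = xi0, P0                    # xi_{1|0} and P_{1|0}, supplied by the caller
    loglik = 0.0
    for y_t, x_t in zip(y, x):
        resid = y_t - A * x_t - H * xi_pred      # forecast error for Y_t
        var = H * P_pred * H + R                 # forecast error variance
        loglik += -0.5 * (math.log(2 * math.pi) + math.log(var) + resid**2 / var)
        gain = F * P_pred * H / var              # scalar Kalman gain
        xi_pred = F * xi_pred + gain * resid     # xi_{t+1|t}
        P_pred = F * (P_pred - P_pred * H * H * P_pred / var) * F + Q  # P_{t+1|t}
    return loglik
```

Each pass through the loop uses only \(\smash{\hat{\xi}_{t|t-1}}\) and \(\smash{P_{t|t-1}}\), so the likelihood costs one filter pass over the sample.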

Basic Prescription

  1. Guess \(\smash{\boldsymbol{\theta}^{(0)}}\) and set \(\smash{s = 0}\).
  2. Given \(\smash{\boldsymbol{\theta}^{(s)}}\), compute \(\smash{F(\boldsymbol{\theta}^{(s)}), A(\boldsymbol{\theta}^{(s)}), H(\boldsymbol{\theta}^{(s)}), Q(\boldsymbol{\theta}^{(s)}),}\) and \(\smash{R(\boldsymbol{\theta}^{(s)})}\).
  3. Use the Kalman filter to iteratively compute \(\smash{\hat{\boldsymbol{\xi}}_{t|t-1}}\) and \(\smash{P_{t|t-1}}\), \(\smash{t =1,\ldots,T}\).
  4. Compute the log-likelihood using \(\smash{H(\boldsymbol{\theta}^{(s)}), A(\boldsymbol{\theta}^{(s)}), R(\boldsymbol{\theta}^{(s)})}\), and \(\smash{\{\hat{\boldsymbol{\xi}}_{t|t-1},P_{t|t-1}\}_{t=1}^{T}}\).
  5. Use a numerical method to update \(\smash{\boldsymbol{\theta}^{(s)}}\) to \(\smash{\boldsymbol{\theta}^{(s+1)}}\).
  6. If \(\smash{||\boldsymbol{\theta}^{(s+1)} - \boldsymbol{\theta}^{(s)}|| < \tau}\) for a chosen tolerance \(\smash{\tau}\), stop. Otherwise, set \(\smash{s = s +1}\) and return to step 2.
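
The steps above can be sketched for a hypothetical one-parameter model: \(\smash{y_{t} = \xi_{t} + w_{t}}\) and \(\smash{\xi_{t+1} = \phi\,\xi_{t} + v_{t+1}}\), with \(\smash{Q = R = 1}\) treated as known and \(\smash{\theta = \phi}\). The update in step 5 is taken here to be gradient ascent with a central-difference derivative and step halving. That choice, the simulated data, the starting values `xi0` and `P0`, and all function names are assumptions for illustration; other numerical methods fit the same prescription.

```python
import math
import random

def avg_loglik(phi, y, Q=1.0, R=1.0, xi0=0.0, P0=1.0):
    """Steps 2-4: run the scalar Kalman filter and return the average log-likelihood."""
    xi_pred, P_pred, total = xi0, P0, 0.0
    for y_t in y:
        resid = y_t - xi_pred                     # H = 1 and no x_t in this toy model
        var = P_pred + R
        total += -0.5 * (math.log(2 * math.pi * var) + resid**2 / var)
        xi_pred = phi * xi_pred + phi * P_pred / var * resid
        P_pred = phi**2 * (P_pred - P_pred**2 / var) + Q
    return total / len(y)

def estimate(y, phi0=0.0, tau=1e-6, max_iter=200, h=1e-5):
    phi = phi0                                    # step 1: initial guess
    for _ in range(max_iter):
        ll = avg_loglik(phi, y)
        grad = (avg_loglik(phi + h, y) - avg_loglik(phi - h, y)) / (2 * h)
        step = 1.0
        # Step 5: halve the step until the likelihood improves or the move is negligible.
        while step * abs(grad) >= tau and avg_loglik(phi + step * grad, y) <= ll:
            step /= 2
        phi_new = phi + step * grad
        if abs(phi_new - phi) < tau:              # step 6: stopping rule
            return phi                            # keep the last accepted iterate
        phi = phi_new
    return phi

# Simulate data from the toy model with phi = 0.8 (illustrative values only).
random.seed(0)
xi, y = 0.0, []
for _ in range(200):
    xi = 0.8 * xi + random.gauss(0.0, 1.0)
    y.append(xi + random.gauss(0.0, 1.0))

phi_hat = estimate(y)
print(phi_hat, avg_loglik(phi_hat, y))
```

Because a step is accepted only when it raises the likelihood, the likelihood at the returned value is never below its value at the initial guess. In practice a quasi-Newton routine from a numerical library would usually replace the hand-written update.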

Updating \(\smash{\boldsymbol{\theta}^{(s)} \rightarrow \boldsymbol{\theta}^{(s+1)}}\) may involve numerical or analytical derivatives.

  • Analytical derivatives of the log-likelihood with respect to each element \(\smash{\theta_{i}}\) of \(\smash{\boldsymbol{\theta}}\) will involve
\[\smash{ \frac{\partial \hat{\boldsymbol{\xi}}_{t|t-1} (\boldsymbol{\theta})}{\partial \theta_{i}} \hspace{4pt} \text{ and } \hspace{4pt} \frac{\partial P_{t|t-1}(\boldsymbol{\theta})}{\partial \theta_{i}}}.\]
  • These derivatives can be computed recursively, in the same way as \(\smash{\hat{\boldsymbol{\xi}}_{t|t-1}}\) and \(\smash{P_{t|t-1}}\) themselves.
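
To illustrate the recursive derivatives, the sketch below differentiates a scalar filter with respect to a single parameter. It assumes a hypothetical model \(\smash{y_{t} = \xi_{t} + w_{t}}\), \(\smash{\xi_{t+1} = \phi\,\xi_{t} + v_{t+1}}\) with known \(\smash{Q}\) and \(\smash{R}\), and starting values that do not depend on \(\smash{\phi}\). The derivative recursions come from applying the product and quotient rules to each filter update. The function name and arguments are illustrative, not from the original notes.

```python
import math

def loglik_and_score(phi, y, Q=1.0, R=1.0, xi0=0.0, P0=1.0):
    """Log-likelihood and its derivative with respect to phi for a scalar model."""
    xi, P = xi0, P0            # xi_{t|t-1} and P_{t|t-1}
    dxi, dP = 0.0, 0.0         # their derivatives; zero because xi0, P0 are fixed
    ll, dll = 0.0, 0.0
    for y_t in y:
        var, dvar = P + R, dP
        resid, dresid = y_t - xi, -dxi
        ll += -0.5 * (math.log(2 * math.pi * var) + resid**2 / var)
        dll += -0.5 * (dvar / var + 2 * resid * dresid / var
                       - resid**2 * dvar / var**2)
        ratio = P * resid / var                   # gain times residual, divided by phi
        dratio = (dP * resid + P * dresid) / var - P * resid * dvar / var**2
        # Tuple assignment evaluates the right-hand sides with the old xi and P.
        xi, dxi = phi * (xi + ratio), xi + ratio + phi * (dxi + dratio)
        P, dP = (phi**2 * P * R / var + Q,
                 2 * phi * P * R / var + phi**2 * R**2 * dP / var**2)
    return ll, dll

# Compare the recursive derivative with a central finite difference.
data = [0.5, -0.3, 1.2, 0.7, -0.1]
h = 1e-6
_, score = loglik_and_score(0.6, data)
fd = (loglik_and_score(0.6 + h, data)[0] - loglik_and_score(0.6 - h, data)[0]) / (2 * h)
print(score, fd)
```

The finite-difference comparison is a useful check whenever analytical derivatives are coded by hand, since an error in one recursion propagates to every later period.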