ML Estimation of State-Space Models
Kalman Filter Forecasts
The Kalman filter forecasts \(\smash{\hat{\boldsymbol{\xi}}_{t|t-1}}\) and \(\smash{\hat{\boldsymbol{Y}}_{t|t-1}}\) are linear projections of \(\smash{\boldsymbol{\xi}_{t}}\) and \(\smash{\boldsymbol{Y}_{t}}\) on \(\smash{(\boldsymbol{x}_{t},\boldsymbol{\mathcal{Y}}_{t-1})}\).
- They are optimal among all forecasts that are linear functions of \(\smash{(\boldsymbol{x}_{t},\boldsymbol{\mathcal{Y}}_{t-1})}\).
- If \(\smash{\boldsymbol{\xi}_{1}}\) and \(\smash{\{\boldsymbol{w}_{t},\boldsymbol{v}_{t}\}_{t=1}^{T}}\) are multivariate Gaussian, \(\smash{\hat{\boldsymbol{\xi}}_{t|t-1}}\) and \(\smash{\hat{\boldsymbol{Y}}_{t|t-1}}\) are optimal among all forecasts that are functions of \(\smash{(\boldsymbol{x}_{t},\boldsymbol{\mathcal{Y}}_{t-1})}\) (linear and non-linear).
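Conditional Distribution of \(\smash{\boldsymbol{Y}_t}\)
If \(\smash{\boldsymbol{\xi}_{1}}\) and \(\smash{\{\boldsymbol{w}_{t},\boldsymbol{v}_{t}\}_{t=1}^{T}}\) are multivariate Gaussian, the distribution of \(\smash{\boldsymbol{Y}_{t}|\boldsymbol{x}_{t},\boldsymbol{\mathcal{Y}}_{t-1}}\) is also multivariate Gaussian, of the form: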
\[\smash{\boldsymbol{Y}_{t}|\boldsymbol{x}_{t},\boldsymbol{\mathcal{Y}}_{t-1} \sim
MVN(A^{'}\boldsymbol{x}_{t} + H^{'}\hat{\boldsymbol{\xi}}_{t|t-1},H^{'}P_{t|t-1}H
+R)}\]
Thus, the density function is
\[\begin{split}\begin{align}
f_{\boldsymbol{Y}_{t}|\boldsymbol{x}_{t},
\boldsymbol{\mathcal{Y}}_{t-1}} & (\boldsymbol{Y}_{t}|
\boldsymbol{x}_{t},
\boldsymbol{\mathcal{Y}}_{t-1},\boldsymbol{\theta}) \\
& = (2\pi)^{-n/2} | H^{'}P_{t|t-1}H + R|^{-1/2} \\
& \hspace{1in} \times
\exp\bigg\{-\frac{1}{2}\left(\boldsymbol{Y}_{t} - A^{'}\boldsymbol{x}_{t} -
H^{'}\hat{\boldsymbol{\xi}}_{t|t-1}\right)^{'} \\
& \hspace{2.5in} \times \left(H^{'}P_{t|t-1}H +
R\right)^{-1} \\
& \hspace{3.5in} \times \left(\boldsymbol{Y}_{t} - A^{'}\boldsymbol{x}_{t} -
H^{'}\hat{\boldsymbol{\xi}}_{t|t-1}\right)\bigg\}
\end{align}\end{split}\]
where \(\smash{\boldsymbol{\theta}}\) aggregates all unknown parameters in \(\smash{F, A, H, Q,}\) and \(\smash{R}\).
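As a rough illustration, the log of this one-period density can be evaluated numerically as in the sketch below. The function name `step_log_density` and the array shapes are assumptions made for this example (with \(\smash{\boldsymbol{Y}_t}\) of dimension \(\smash{n}\), \(\smash{\boldsymbol{x}_t}\) of dimension \(\smash{k}\), and the state of dimension \(\smash{r}\)); they are not part of the original notes.

```python
import numpy as np

def step_log_density(y, x, xi_pred, P_pred, A, H, R):
    """Log of f(Y_t | x_t, Y_{t-1}; theta) for a single period.

    Assumed shapes: y (n,), x (k,), xi_pred (r,), P_pred (r, r),
    A (k, n), H (r, n), R (n, n).
    """
    n = y.shape[0]
    resid = y - A.T @ x - H.T @ xi_pred       # Y_t - A'x_t - H'xi_{t|t-1}
    S = H.T @ P_pred @ H + R                  # H'P_{t|t-1}H + R
    sign, logdet = np.linalg.slogdet(S)       # stable log-determinant
    if sign <= 0:
        raise ValueError("H'PH + R must be positive definite")
    quad = resid @ np.linalg.solve(S, resid)  # resid' S^{-1} resid
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + quad)
```

Working with the log-determinant and a linear solve, rather than an explicit determinant and inverse, is a common choice for numerical stability; it does not change the value being computed.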
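Log-likelihood
The log-likelihood is the log of the joint density of the sample, which factors into the sum of the logs of the one-step-ahead conditional densities: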
\[\smash{\ell(\boldsymbol{\theta}) = \sum_{t=1}^{T}
\log\left(f_{\boldsymbol{Y}_{t}|\boldsymbol{x}_{t},
\boldsymbol{\mathcal{Y}}_{t-1}}
(\boldsymbol{Y}_{t}|\boldsymbol{x}_{t},
\boldsymbol{\mathcal{Y}}_{t-1},\boldsymbol{\theta})\right)}\]
- The log-likelihood can be maximized numerically with respect to \(\smash{\boldsymbol{\theta}}\), which enters through \(\smash{F(\boldsymbol{\theta}), A(\boldsymbol{\theta}), H(\boldsymbol{\theta}), Q(\boldsymbol{\theta})}\), and \(\smash{R(\boldsymbol{\theta})}\).
- When \(\smash{\boldsymbol{\xi}_{1}}\) and the disturbances are Gaussian, this is the exact log-likelihood, so maximizing it yields exact MLEs.
- Maximum likelihood estimation for \(\smash{MA}\) and \(\smash{ARMA}\) models can be performed in this manner.
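A minimal sketch of how the sum above might be accumulated while running the filter is given below. It assumes the state equation \(\smash{\boldsymbol{\xi}_{t+1} = F\boldsymbol{\xi}_{t} + \boldsymbol{v}_{t+1}}\) with disturbance variance \(\smash{Q}\), and that the initial forecast \(\smash{\hat{\boldsymbol{\xi}}_{1|0}}\) and its MSE \(\smash{P_{1|0}}\) are supplied by the caller. The function name `kalman_loglik` and its argument layout are illustrative choices, not taken from the notes.

```python
import numpy as np

def kalman_loglik(Y, X, F, A, H, Q, R, xi0, P0):
    """Gaussian log-likelihood from one pass of the Kalman filter.

    Assumed shapes: Y (T, n), X (T, k), F (r, r), A (k, n), H (r, n),
    Q (r, r), R (n, n); xi0 (r,) and P0 (r, r) play the role of
    xi_{1|0} and P_{1|0}.
    """
    T, n = Y.shape
    xi, P = xi0.copy(), P0.copy()
    loglik = 0.0
    for t in range(T):
        resid = Y[t] - A.T @ X[t] - H.T @ xi   # one-step forecast error
        S = H.T @ P @ H + R                    # its covariance
        S_inv_resid = np.linalg.solve(S, resid)
        _, logdet = np.linalg.slogdet(S)
        loglik += -0.5 * (n * np.log(2.0 * np.pi) + logdet
                          + resid @ S_inv_resid)
        # Move from (xi_{t|t-1}, P_{t|t-1}) to (xi_{t+1|t}, P_{t+1|t}).
        xi = F @ (xi + P @ H @ S_inv_resid)
        P = F @ (P - P @ H @ np.linalg.solve(S, H.T @ P)) @ F.T + Q
    return loglik
```

For a short Gaussian sample, the returned value should agree with the log of the joint multivariate normal density of \(\smash{(\boldsymbol{Y}_1,\ldots,\boldsymbol{Y}_T)}\) evaluated directly, which is a convenient check on an implementation.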
Basic Prescription
- Guess \(\smash{\boldsymbol{\theta}^{(0)}}\)
- Given \(\smash{\boldsymbol{\theta}^{(s)}}\), compute \(\smash{F(\boldsymbol{\theta}^{(s)}), A(\boldsymbol{\theta}^{(s)}), H(\boldsymbol{\theta}^{(s)}), Q(\boldsymbol{\theta}^{(s)}),}\) and \(\smash{R(\boldsymbol{\theta}^{(s)})}\).
- Use the Kalman Filter to iteratively compute \(\smash{\hat{\boldsymbol{\xi}}_{t|t-1}}\) and \(\smash{P_{t|t-1}}\), \(\smash{t =1,\ldots,T}\).
- Compute the log-likelihood using \(\smash{H(\boldsymbol{\theta}^{(s)}), A(\boldsymbol{\theta}^{(s)}), R(\boldsymbol{\theta}^{(s)})}\), and \(\smash{\{\hat{\boldsymbol{\xi}}_{t|t-1},P_{t|t-1}\}_{t=1}^{T}}\).
- Use a numerical method to update \(\smash{\boldsymbol{\theta}^{(s+1)}}\).
- If \(\smash{||\boldsymbol{\theta}^{(s+1)} - \boldsymbol{\theta}^{(s)}|| < \tau}\), stop. Otherwise, set \(\smash{s = s +1}\) and return to step 2.
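The prescription can be sketched for a zero-mean MA(1) model, \(\smash{y_t = \varepsilon_t + \psi\varepsilon_{t-1}}\), written in state-space form with state \(\smash{(\varepsilon_t, \varepsilon_{t-1})^{'}}\). Everything specific in this example is an assumption made for illustration: the symbol \(\smash{\psi}\), the helper names `ma1_system` and `neg_loglik`, the simulated data, the log-variance parameterization, and the use of SciPy's BFGS routine as the numerical method. BFGS applies its own gradient-based stopping rule in place of the \(\smash{\tau}\) criterion on successive iterates.

```python
import numpy as np
from scipy.optimize import minimize

def ma1_system(theta):
    """Map theta = (ma_coef, log_sigma2) to the state-space matrices."""
    ma_coef, log_sigma2 = theta
    sigma2 = np.exp(log_sigma2)                # keeps the variance positive
    F = np.array([[0.0, 0.0], [1.0, 0.0]])     # shifts eps_t into the lag slot
    Q = np.array([[sigma2, 0.0], [0.0, 0.0]])
    h = np.array([1.0, ma_coef])               # y_t = h' xi_t, with R = 0
    P0 = sigma2 * np.eye(2)                    # unconditional state variance
    return F, Q, h, P0

def neg_loglik(theta, y):
    """Negative Gaussian log-likelihood from one Kalman filter pass."""
    F, Q, h, P = ma1_system(theta)
    xi = np.zeros(2)                           # xi_{1|0} = 0
    loglik = 0.0
    for obs in y:
        resid = obs - h @ xi                   # scalar forecast error
        S = h @ P @ h                          # scalar forecast-error variance
        loglik += -0.5 * (np.log(2.0 * np.pi * S) + resid ** 2 / S)
        gain = P @ h / S
        xi = F @ (xi + gain * resid)
        P = F @ (P - S * np.outer(gain, gain)) @ F.T + Q
    return -loglik

# Simulated data (an assumption of this sketch): MA(1) with coefficient 0.5.
rng = np.random.default_rng(0)
eps = rng.standard_normal(501)
y = eps[1:] + 0.5 * eps[:-1]

# Initial guess theta^(0), then let the optimizer iterate the updates.
result = minimize(neg_loglik, x0=np.array([0.0, 0.0]), args=(y,), method="BFGS")
ma_hat, sigma2_hat = result.x[0], np.exp(result.x[1])
```

Each evaluation of `neg_loglik` rebuilds the system matrices from the current parameter vector, runs the filter, and evaluates the log-likelihood, corresponding to steps 2 through 4; the optimizer supplies the update in step 5. Because an MA(1) likelihood has a mirror-image non-invertible solution, a starting value inside the invertible region is a sensible default.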
Updating \(\smash{\boldsymbol{\theta}^{(s)} \rightarrow \boldsymbol{\theta}^{(s+1)}}\) may involve numerical or analytical derivatives.
- Analytical derivatives of the log-likelihood with respect to each element \(\smash{\theta_{i}}\) of \(\smash{\boldsymbol{\theta}}\) will involve
\[\smash{ \frac{\partial \hat{\boldsymbol{\xi}}_{t|t-1}
(\boldsymbol{\theta})}{\partial \theta_{i}} \hspace{4pt} \text{
and } \hspace{4pt} \frac{\partial P_{t|t-1}(\boldsymbol{\theta})}{\partial
\theta_{i}}}.\]
- These derivatives can be computed recursively, in the same way as \(\smash{\hat{\boldsymbol{\xi}}_{t|t-1}}\) and \(\smash{P_{t|t-1}}\) themselves.
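To see where these terms enter, it may help to write out the derivative of a single period's contribution. The shorthand \(\smash{\boldsymbol{e}_{t}}\) for the forecast error and \(\smash{\Sigma_{t}}\) for its covariance is introduced here for convenience and does not appear elsewhere in these notes:

\[\boldsymbol{e}_{t} = \boldsymbol{Y}_{t} - A^{'}\boldsymbol{x}_{t} - H^{'}\hat{\boldsymbol{\xi}}_{t|t-1}, \qquad \Sigma_{t} = H^{'}P_{t|t-1}H + R.\]

With \(\smash{\ell_{t}(\boldsymbol{\theta}) = -\frac{n}{2}\log(2\pi) - \frac{1}{2}\log|\Sigma_{t}| - \frac{1}{2}\boldsymbol{e}_{t}^{'}\Sigma_{t}^{-1}\boldsymbol{e}_{t}}\), standard matrix calculus gives

\[\frac{\partial \ell_{t}}{\partial \theta_{i}} = -\frac{1}{2}\text{tr}\left(\Sigma_{t}^{-1}\frac{\partial \Sigma_{t}}{\partial \theta_{i}}\right) - \boldsymbol{e}_{t}^{'}\Sigma_{t}^{-1}\frac{\partial \boldsymbol{e}_{t}}{\partial \theta_{i}} + \frac{1}{2}\boldsymbol{e}_{t}^{'}\Sigma_{t}^{-1}\frac{\partial \Sigma_{t}}{\partial \theta_{i}}\Sigma_{t}^{-1}\boldsymbol{e}_{t},\]

where, by the product rule,

\[\frac{\partial \boldsymbol{e}_{t}}{\partial \theta_{i}} = -\frac{\partial A^{'}}{\partial \theta_{i}}\boldsymbol{x}_{t} - \frac{\partial H^{'}}{\partial \theta_{i}}\hat{\boldsymbol{\xi}}_{t|t-1} - H^{'}\frac{\partial \hat{\boldsymbol{\xi}}_{t|t-1}}{\partial \theta_{i}},\]

\[\frac{\partial \Sigma_{t}}{\partial \theta_{i}} = \frac{\partial H^{'}}{\partial \theta_{i}}P_{t|t-1}H + H^{'}\frac{\partial P_{t|t-1}}{\partial \theta_{i}}H + H^{'}P_{t|t-1}\frac{\partial H}{\partial \theta_{i}} + \frac{\partial R}{\partial \theta_{i}}.\]

Summing \(\smash{\partial \ell_{t}/\partial \theta_{i}}\) over \(\smash{t = 1,\ldots,T}\) gives the corresponding element of the score, so the only quantities that must be carried along with the filter are the two derivatives named above.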