Stationarity

Time Series

A time series is a stochastic process indexed by time:

\[Y_1, Y_2, Y_3, \ldots, Y_{T-1}, Y_T.\]
  • Stochastic is a synonym for random.
  • So a time series is a sequence of (potentially different) random variables ordered by time.
  • We will let lower-case letters denote a realization of a time series.
\[y_1, y_2, y_3, \ldots, y_{T-1}, y_T.\]

Distributions

We will think of \({\bf Y}_T = \{Y_t\}_{t=1}^T\) as a random variable in its own right.

  • \({\bf y}_T = \{y_t\}_{t=1}^T\) is a single realization of \({\bf Y}_T = \{Y_t\}_{t=1}^T\).
  • The CDF is \(F_{{\bf Y}_T}({\bf y}_T)\) and the PDF is \(f_{{\bf Y}_T}({\bf y}_T)\).
  • For example, consider \(T = 100\):
\[\begin{split}F\left({\bf y}_{100}\right) & = P(Y_1 \leq y_1, \ldots, Y_{100} \leq y_{100}).\end{split}\]
  • Notice that \({\bf Y}_T\) is just a collection of random variables and \(f_{{\bf Y}_T}({\bf y}_T)\) is the joint density.

Time Series Observations

As statisticians and econometricians, we want many observations of \({\bf Y}_T\) to learn about its distribution:

\[{\bf y}_T^{(1)}, \,\,\,\,\,\, {\bf y}_T^{(2)}, \,\,\,\,\,\, {\bf y}_T^{(3)}, \,\,\,\,\,\, \ldots\]

Likewise, if we are only interested in the marginal distribution of \(Y_{17}\)

\[F_{Y_{17}}(a) = P(Y_{17} \leq a)\]

we want many observations: \(\left\{y_{17}^{(i)}\right\}_{i=1}^N\).
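
For instance, here is a minimal sketch (the process, sample size, and threshold \(a\) are hypothetical, chosen only for illustration) that estimates \(P(Y_{17} \leq a)\) as the fraction of realizations with \(y_{17}^{(i)} \leq a\):

```python
import numpy as np

rng = np.random.default_rng(0)

N, T = 1000, 100                 # hypothetical number of realizations and series length
# Draw N independent realizations of a toy process; column t-1 plays the role of y_t^{(i)}.
y = rng.standard_normal((N, T)).cumsum(axis=1)   # e.g. a Gaussian random walk

a = 0.5
y17 = y[:, 16]                   # the N observations {y_17^{(i)}}
cdf_hat = np.mean(y17 <= a)      # empirical estimate of P(Y_17 <= a)
```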

Time Series Observations

Unfortunately, we usually only have one observation of \({\bf Y}_T\).

  • Think of the daily closing price of Harley-Davidson stock since January 2nd.
  • Think of your cardiogram for the past 100 seconds.

In neither case can you repeat history to observe a new sequence of prices or electronic heart signals.

  • In time series econometrics we typically base inference on a single observation.
  • Additional assumptions about the process will allow us to exploit information in the full sequence \({\bf y}_T\) to make inferences about the joint distribution \(F_{{\bf Y}_T}({\bf y}_T)\).

Moments

Since the stochastic process is composed of individual random variables, we can consider moments of each:

\[\begin{split}E[Y_t] & = \int_{-\infty}^{\infty} y_t f_{Y_t}(y_t) dy_t = \mu_t\end{split}\]
\[\begin{split}Var(Y_t) & = \int_{-\infty}^{\infty} (y_t-\mu_t)^2 f_{Y_t}(y_t) dy_t = \gamma_{0t}\end{split}\]
\[\begin{split}Cov(Y_t, Y_{t-j}) & = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (y_t-\mu_t)(y_{t-j}-\mu_{t-j}) \\ & \hspace{1in} \times \, f_{Y_t,Y_{t-j}}(y_t,y_{t-j}) dy_tdy_{t-j} = \gamma_{jt},\end{split}\]

where \(f_{Y_t}\) and \(f_{Y_t,Y_{t-j}}\) are the marginal distributions of \(f_{{\bf Y}_T}\) obtained by integrating over the appropriate elements of \({\bf Y}_T\).

Autocovariance and Autocorrelation

  • \(\gamma_{jt}\) is known as the \(j\) th autocovariance of \(Y_t\) since it is the covariance of \(Y_t\) with its own lagged value.
  • The \(j\) th autocorrelation of \(Y_t\) is defined as
\[\begin{split}\rho_{jt} & = Corr(Y_t, Y_{t-j}) \\ & = \frac{Cov(Y_t, Y_{t-j})}{\sqrt{Var(Y_t)} \sqrt{Var(Y_{t-j})}} \\ & = \frac{\gamma_{jt}}{\sqrt{\gamma_{0t}} \sqrt{\gamma_{0,t-j}}}.\end{split}\]

Sample Moments

If we had \(N\) observations \({\bf y}_T^{(1)},\ldots,{\bf y}_T^{(N)}\), we could estimate moments of each (univariate) \(Y_t\) in the usual way:

\[\begin{split}\hat{\mu}_t & = \frac{1}{N} \sum_{i=1}^N y_t^{(i)}.\end{split}\]
\[\begin{split}\hat{\gamma}_{0t} & = \frac{1}{N} \sum_{i=1}^N (y_t^{(i)} - \hat{\mu}_t)^2.\end{split}\]
\[\begin{split}\hat{\gamma}_{jt} & = \frac{1}{N} \sum_{i=1}^N (y_t^{(i)} - \hat{\mu}_t) (y_{t-j}^{(i)} - \hat{\mu}_{t-j}).\end{split}\]
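
A minimal NumPy sketch of these estimators, applied to a simulated panel of \(N\) realizations (the process, its time-varying mean and variance, and the lag \(j\) are all assumptions chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

N, T, j = 500, 100, 3                      # hypothetical sizes and lag
mu_t = np.linspace(0.0, 2.0, T)            # hypothetical time-varying mean
sigma_t = np.linspace(1.0, 1.5, T)         # hypothetical time-varying std. dev.
y = mu_t + sigma_t * rng.standard_normal((N, T))   # y[i, t] is y_{t+1}^{(i)} (zero-based columns)

mu_hat = y.mean(axis=0)                                  # \hat{\mu}_t for each t
gamma0_hat = ((y - mu_hat) ** 2).mean(axis=0)            # \hat{\gamma}_{0t}
d = y - mu_hat                                           # deviations from the estimated means
gamma_j_hat = (d[:, j:] * d[:, :-j]).mean(axis=0)        # \hat{\gamma}_{jt} for t = j+1, ..., T
```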

Example

Suppose each element of \({\bf Y}_T\) is described by

\[\begin{split}Y_t & = \mu_t + \varepsilon_t, \,\,\,\, \varepsilon_t \sim \mathcal{N}(0,\sigma^2_t), \forall t.\end{split}\]

Example

In this case,

\[\begin{split}E[Y_t] & = \mu_t, \,\,\, \forall t,\end{split}\]
\[\begin{split}\gamma_{0t} & = Var(Y_t) = Var(\varepsilon_t) = \sigma^2_t, \,\,\, \forall t\end{split}\]
\[\begin{split}\gamma_{jt} & = Cov(Y_t, Y_{t-j}) = Cov(\varepsilon_t, \varepsilon_{t-j}) = 0, \,\,\, \forall t, j \neq 0.\end{split}\]
  • If \(\sigma^2_t = \sigma^2\) \(\forall t\), \({\bf \varepsilon}_T\) is known as a Gaussian white noise process.
  • In this case, \({\bf Y}_T\) is a Gaussian white noise process with drift.
  • \({\bf \mu}_T\) is the drift vector.

White Noise

Generally speaking, \({\bf \varepsilon}_T\) is a white noise process if

(1)\[\begin{split}E[\varepsilon_t] & = 0, \,\,\, \forall t\end{split}\]
(2)\[\begin{split}E[\varepsilon^2_t] & = \sigma^2, \,\,\, \forall t\end{split}\]
(3)\[\begin{split}E[\varepsilon_t \varepsilon_{\tau}] & = 0, \,\,\, \text{ for } t \neq \tau.\end{split}\]

White Noise

Notice there is no distributional assumption for \(\varepsilon_t\).

  • If \(\varepsilon_t\) and \(\varepsilon_{\tau}\) are independent for \(t \neq \tau\), \({\bf \varepsilon}_T\) is independent white noise.
  • Notice that independence \(\Rightarrow\) Equation (3), but Equation (3) does not \(\Rightarrow\) independence; see the sketch after this list.
  • If \(\varepsilon_t \sim \mathcal{N}(0, \sigma^2)\) \(\forall t\), as in the example above, \({\bf \varepsilon}_T\) is Gaussian white noise.
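
To see why Equation (3) does not imply independence, consider the construction \(\varepsilon_t = z_t z_{t-1}\) with \(z_t\) i.i.d. \(\mathcal{N}(0,1)\), used here purely for illustration: each \(\varepsilon_t\) has mean zero and unit variance, and all autocovariances are zero, yet \(\varepsilon_t\) and \(\varepsilon_{t-1}\) both depend on \(z_{t-1}\) and so are not independent. A minimal simulation check:

```python
import numpy as np

rng = np.random.default_rng(0)

n = 200_000
z = rng.standard_normal(n + 1)
eps = z[1:] * z[:-1]             # eps_t = z_t * z_{t-1}: white noise, but not independent

print(np.mean(eps))                                    # ~ 0   (condition 1)
print(np.var(eps))                                     # ~ 1   (condition 2)
print(np.mean(eps[1:] * eps[:-1]))                     # ~ 0   (condition 3 at lag 1)
print(np.corrcoef(eps[1:] ** 2, eps[:-1] ** 2)[0, 1])  # clearly positive: squares are dependent
```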

Weak Stationarity

Suppose the first and second moments of a stochastic process \({\bf Y}_{T}\) do not depend on \(t\):

\[\begin{split}E[Y_t] & = \mu \,\,\,\, \forall t\end{split}\]
\[\begin{split}Cov(Y_t, Y_{t-j}) & = \gamma_j \,\,\,\, \forall t \text{ and any } j.\end{split}\]
  • In this case \({\bf Y}_{T}\) is weakly stationary or covariance stationary.
  • In the previous example, if \(Y_t = \mu + \varepsilon_t\) with \(\sigma^2_t = \sigma^2\) \(\forall t\), \({\bf Y}_{T}\) is weakly stationary.
  • However, if \(\mu_t\) or \(\sigma^2_t\) varies with \(t\), \({\bf Y}_{T}\) is not weakly stationary.

Autocorrelation under Weak Stationarity

If \({\bf Y}_{T}\) is weakly stationary

\[\begin{split}\rho_{jt} & = \frac{\gamma_{jt}}{\sqrt{\gamma_{0t}} \sqrt{\gamma_{0,t-j}}} \\ & = \frac{\gamma_j}{\sqrt{\gamma_0} \sqrt{\gamma_0}} \\ & = \frac{\gamma_j}{\gamma_0} \\ & = \rho_j.\end{split}\]
  • Note that \(\rho_0 = 1\).
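
Because \(\gamma_j\) and \(\rho_j\) do not depend on \(t\) under weak stationarity, they can be estimated from a single realization by averaging over time rather than over repeated realizations (given the kind of additional assumptions mentioned earlier). A minimal sketch, with an AR(1) process assumed purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate one realization of a weakly stationary AR(1): Y_t = 0.7 Y_{t-1} + eps_t.
T = 10_000
y = np.empty(T)
y[0] = rng.standard_normal()
for t in range(1, T):
    y[t] = 0.7 * y[t - 1] + rng.standard_normal()

ybar = y.mean()
gamma0_hat = np.mean((y - ybar) ** 2)            # sample lag-0 autocovariance (variance)

def rho_hat(j):
    """Sample j-th autocorrelation computed from the single realization y."""
    gj = np.mean((y[j:] - ybar) * (y[:-j] - ybar))
    return gj / gamma0_hat

print([round(rho_hat(j), 3) for j in (1, 2, 3)])  # roughly 0.7, 0.49, 0.34 for this AR(1)
```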

Weak Stationarity

Under weak stationarity, autocovariances \(\gamma_j\) only depend on the distance between random variables within a stochastic process:

\[Cov(Y_{\tau}, Y_{\tau-j}) = Cov(Y_t, Y_{t-j}) = \gamma_j.\]

This implies

\[\gamma_{-j} = Cov(Y_{t+j}, Y_t) = Cov(Y_t, Y_{t-j}) = \gamma_j.\]

Weak Stationarity

More generally,

\[\begin{split}\Sigma_{{\bf Y}_T} & = \left[\begin{array}{ccccc} \gamma_0 & \gamma_1 & \cdots & \gamma_{T-2} & \gamma_{T-1} \\ \gamma_1 & \gamma_0 & \cdots & \gamma_{T-3} & \gamma_{T-2} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ \gamma_{T-2} & \gamma_{T-3} & \cdots & \gamma_0 & \gamma_1 \\ \gamma_{T-1} & \gamma_{T-2} & \cdots & \gamma_1 & \gamma_0 \end{array}\right].\end{split}\]
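
Under weak stationarity \(\Sigma_{{\bf Y}_T}\) is a symmetric Toeplitz matrix, so it can be built directly from \(\gamma_0, \ldots, \gamma_{T-1}\). A small sketch with hypothetical autocovariance values:

```python
import numpy as np

T = 5
gammas = np.array([2.0, 1.2, 0.7, 0.4, 0.2])                   # hypothetical gamma_0, ..., gamma_{T-1}
lags = np.abs(np.arange(T)[:, None] - np.arange(T)[None, :])   # |s - t| for every pair of indices
Sigma = gammas[lags]                                           # Sigma[s, t] = gamma_{|s - t|}
print(Sigma)          # scipy.linalg.toeplitz(gammas) would build the same matrix
```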

Strict Stationarity

\({\bf Y}_{T}\) is strictly stationary if for any set of indices \(\{j_1, j_2, \ldots, j_n\} \subset \{1, \ldots, T\}\)

\[f_{Y_{j_1},\ldots,Y_{j_n}}(a_1, \ldots, a_n) = f_{Y_{j_1 + \tau},\ldots,Y_{j_n + \tau}}(a_1, \ldots, a_n), \,\,\, \forall \tau.\]
  • Strict stationarity means that the joint distribution of any subset of random variables in \({\bf Y}_{T}\) is invariant to shifts in time, \(\tau\).
  • Strict stationarity \(\Rightarrow\) weak stationarity if the first and second moments of a stochastic process exist.
  • Weak stationarity does not \(\Rightarrow\) strict stationarity: invariance of first and second moments to time shifts (weak stationarity) does not mean that all higher moments are invariant to time shifts (strict stationarity).

Strict Stationarity

If \({\bf Y}_{T}\) is Gaussian then weak stationarity \(\Rightarrow\) strict stationarity.

  • If \({\bf Y}_{T}\) is Gaussian, the joint distribution of any subset \((Y_{j_1}, \ldots, Y_{j_n})\) is also Gaussian.
  • Gaussian distributions are fully characterized by their first and second moments.