.. slideconf::
   :slide_classes: appear

==============================================================================
Stationarity
==============================================================================

Introduction
==============================================================================

Time series analysis is concerned with dynamics.

- We may have complete knowledge of the *unconditional* distribution of a
  group of random variables but no understanding of their sequential
  dynamics.

- Time series analysis is focused on understanding the sequential
  relationships of a group of random variables.

- Hence, the focus is on *conditional* distributions and *autocovariances*.

Time Series
==============================================================================

A time series is a stochastic process indexed by time:

.. math::

   \begin{align*}
   Y_1, Y_2, Y_3, \ldots, Y_{T-1}, Y_T.
   \end{align*}

- *Stochastic* is a synonym for *random*.

- So a time series is a sequence of (potentially different) random
  variables ordered by time.

- We will let lower-case letters denote a realization of a time series:

.. math::

   \begin{align*}
   y_1, y_2, y_3, \ldots, y_{T-1}, y_T.
   \end{align*}

Distributions
==============================================================================

We will think of :math:`{\bf Y}_T = \{Y_t\}_{t=1}^T` as a random variable in
its own right.

- :math:`{\bf y}_T = \{y_t\}_{t=1}^T` is a *single* realization of
  :math:`{\bf Y}_T = \{Y_t\}_{t=1}^T`.

- The CDF is :math:`F_{{\bf Y}_T}({\bf y}_T)` and the PDF is
  :math:`f_{{\bf Y}_T}({\bf y}_T)`.

- For example, consider :math:`\smash{T = 100}`:

.. math::

   \begin{align*}
   F_{{\bf Y}_{100}}\left({\bf y}_{100}\right) & = P(Y_1 \leq y_1, \ldots,
   Y_{100} \leq y_{100}).
   \end{align*}

- Notice that :math:`{\bf Y}_T` is just a collection of random variables
  and :math:`f_{{\bf Y}_T}({\bf y}_T)` is the joint density.

Time Series Observations
==============================================================================

As statisticians and econometricians, we want many observations of
:math:`{\bf Y}_T` to learn about its distribution:

.. math::

   \begin{align*}
   {\bf y}_T^{(1)}, \,\,\,\,\,\, {\bf y}_T^{(2)}, \,\,\,\,\,\,
   {\bf y}_T^{(3)}, \,\,\,\,\,\, \ldots
   \end{align*}

Likewise, if we are only interested in the marginal distribution of
:math:`\smash{Y_{17}}`

.. math::

   \begin{align*}
   F_{Y_{17}}(a) = P(Y_{17} \leq a),
   \end{align*}

we want many observations: :math:`\smash{\left\{y_{17}^{(i)}\right\}_{i=1}^N}`.

Time Series Observations
==============================================================================

Unfortunately, we usually only have *one observation* of :math:`{\bf Y}_T`.

- Think of the daily closing price of Harley-Davidson stock since January
  2nd.

- Think of your cardiogram for the past 100 seconds.

In neither case can you repeat history to observe a new sequence of prices
or electric heart signals.

- In time series econometrics we typically base inference on a single
  observation.

- Additional assumptions about the process will allow us to exploit
  information in the full sequence :math:`{\bf y}_T` to make inferences
  about the joint distribution :math:`F_{{\bf Y}_T}({\bf y}_T)` (see the
  simulation sketch on the next slide).
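
Simulation: One Realization vs. Many
==============================================================================

To fix ideas, here is a minimal simulation sketch (assuming NumPy; the
process and parameter choices are purely illustrative, not from the text).
With many independent realizations of :math:`{\bf Y}_T` we could estimate
the marginal distribution of :math:`\smash{Y_{17}}` directly; with the
single realization we typically observe, we cannot.

.. code-block:: python

   import numpy as np

   rng = np.random.default_rng(0)
   T, N = 100, 5000        # series length, number of realizations

   # Hypothetical process: Y_t = mu + eps_t with iid Gaussian noise.
   mu, sigma = 1.0, 2.0
   Y = mu + sigma * rng.standard_normal((N, T))   # each row is one y_T

   # With N realizations, the empirical CDF of column 17 estimates
   # F_{Y_17}(a) = P(Y_17 <= a) for any a.
   y17 = Y[:, 16]                   # 0-based index for t = 17
   a = 2.0
   print(np.mean(y17 <= a))         # estimate of F_{Y_17}(a)

   # In practice we observe a single row -- one realization y_T -- and
   # need further assumptions to learn about the joint distribution.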
Moments
==============================================================================

Since the stochastic process comprises individual random variables, we can
consider moments of each:

.. math::

   \begin{align*}
   E[Y_t] & = \int_{-\infty}^{\infty} y_t f_{Y_t}(y_t) \, dy_t = \mu_t \\
   Var(Y_t) & = \int_{-\infty}^{\infty} (y_t-\mu_t)^2 f_{Y_t}(y_t) \, dy_t
   = \gamma_{0t}
   \end{align*}

Moments
==============================================================================

.. math::

   \begin{align*}
   Cov(Y_t, Y_{t-j}) & = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty}
   (y_t-\mu_t)(y_{t-j}-\mu_{t-j}) \\
   & \hspace{2in} \times \, f_{Y_t,Y_{t-j}}(y_t,y_{t-j}) \,
   dy_t \, dy_{t-j} = \gamma_{jt},
   \end{align*}

where :math:`\smash{f_{Y_t}}` and :math:`\smash{f_{Y_t,Y_{t-j}}}` are the
marginal densities of :math:`f_{{\bf Y}_T}` obtained by integrating over
the appropriate elements of :math:`{\bf Y}_T`.

Autocovariance and Autocorrelation
==============================================================================

- :math:`\smash{\gamma_{jt}}` is known as the :math:`\smash{j}` th
  autocovariance of :math:`\smash{Y_t}` since it is the covariance of
  :math:`\smash{Y_t}` with its own lagged value.

- The :math:`\smash{j}` th autocorrelation of :math:`\smash{Y_t}` is
  defined as

.. math::

   \begin{align*}
   \rho_{jt} & = Corr(Y_t, Y_{t-j}) \\
   & = \frac{Cov(Y_t, Y_{t-j})}{\sqrt{Var(Y_t)} \sqrt{Var(Y_{t-j})}} \\
   & = \frac{\gamma_{jt}}{\sqrt{\gamma_{0t}} \sqrt{\gamma_{0,t-j}}}.
   \end{align*}

Sample Moments
==============================================================================

If we had :math:`N` observations :math:`{\bf y}_T^{(1)},\ldots,{\bf y}_T^{(N)}`,
we could estimate moments of each (univariate) :math:`\smash{Y_t}` in the
usual way:

.. math::

   \begin{align*}
   \hat{\mu}_t & = \frac{1}{N} \sum_{i=1}^N y_t^{(i)} \\
   \hat{\gamma}_{0t} & = \frac{1}{N} \sum_{i=1}^N (y_t^{(i)} -
   \hat{\mu}_t)^2 \\
   \hat{\gamma}_{jt} & = \frac{1}{N} \sum_{i=1}^N (y_t^{(i)} -
   \hat{\mu}_t) (y_{t-j}^{(i)} - \hat{\mu}_{t-j}).
   \end{align*}

Example
==============================================================================

Suppose each element of :math:`{\bf Y}_T` is described by

.. math::

   \begin{align*}
   Y_t & = \mu_t + \varepsilon_t, \,\,\,\, \varepsilon_t
   \stackrel{i.i.d.}{\sim} \mathcal{N}(0,\sigma^2_t), \,\,\, \forall t.
   \end{align*}

Example
==============================================================================

In this case,

.. math::

   \begin{align*}
   \mu_t & = E[Y_t] = \mu_t, \,\,\, \forall t, \\
   \gamma_{0t} & = Var(Y_t) = Var(\varepsilon_t) = \sigma^2_t, \,\,\,
   \forall t, \\
   \gamma_{jt} & = Cov(Y_t, Y_{t-j}) = Cov(\varepsilon_t,
   \varepsilon_{t-j}) = 0, \,\,\, \forall t, j \neq 0.
   \end{align*}

- If :math:`\smash{\sigma^2_t = \sigma^2}` :math:`\smash{\forall t}`,
  :math:`\boldsymbol{\varepsilon}_T` is known as a *Gaussian white noise*
  process.

- In this case, :math:`{\bf Y}_T` is a Gaussian white noise process with
  drift.

- :math:`\boldsymbol{\mu}_T` is the drift vector (see the simulation sketch
  on the next slide).
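
Simulation: Sample Moments
==============================================================================

A minimal sketch of the ensemble estimators above, applied to the example
process (assuming NumPy; the time-varying drift and volatility are
hypothetical choices, used only to show that the moments depend on
:math:`t`):

.. code-block:: python

   import numpy as np

   rng = np.random.default_rng(1)
   T, N = 100, 20000

   # Example process: Y_t = mu_t + eps_t, eps_t ~ N(0, sigma_t^2),
   # with hypothetical time-varying drift and volatility.
   t = np.arange(1, T + 1)
   mu_t = 0.05 * t                    # drift rises with t
   sigma_t = 1.0 + t / T              # volatility rises with t
   Y = mu_t + sigma_t * rng.standard_normal((N, T))

   # Ensemble estimates of mu_t and gamma_0t for every t:
   mu_hat = Y.mean(axis=0)            # tracks mu_t
   gamma0_hat = Y.var(axis=0)         # tracks sigma_t^2

   # Ensemble estimate of gamma_jt at t = 60, j = 5 (should be near 0):
   tt, j = 59, 5                      # 0-based index for t = 60
   gamma_j_hat = np.mean((Y[:, tt] - mu_hat[tt])
                         * (Y[:, tt - j] - mu_hat[tt - j]))
   print(mu_hat[tt], gamma0_hat[tt], gamma_j_hat)

With many realizations the estimates recover :math:`\smash{\mu_t}` and
:math:`\smash{\sigma^2_t}` separately at each :math:`t`, and the estimated
autocovariances are near zero, as the example implies.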
White Noise
==============================================================================

Generally speaking, :math:`\boldsymbol{\varepsilon}_T` is a *white noise
process* if

.. math::

   \begin{gather*}
   E[\varepsilon_t] = 0, \,\,\, \forall t \\
   E[\varepsilon^2_t] = \sigma^2, \,\,\, \forall t \\
   E[\varepsilon_t \varepsilon_{\tau}] = 0, \,\,\, \text{ for } t \neq \tau.
   \end{gather*}

White Noise
==============================================================================

Notice there is no distributional assumption for :math:`\varepsilon_t`.

- If :math:`\smash{\varepsilon_t}` and :math:`\smash{\varepsilon_{\tau}}`
  are independent for :math:`\smash{t \neq \tau}`,
  :math:`\boldsymbol{\varepsilon}_T` is *independent white noise*.

- Notice that independence :math:`\smash{\Rightarrow E[\varepsilon_t
  \varepsilon_{\tau}] = 0}`, but :math:`E[\varepsilon_t \varepsilon_{\tau}]
  = 0 \not\Rightarrow` independence.

- If :math:`\smash{\varepsilon_t \sim \mathcal{N}(0, \sigma^2)}`
  :math:`\forall t`, as in the example above,
  :math:`\boldsymbol{\varepsilon}_T` is Gaussian white noise.

Weak Stationarity
==============================================================================

Suppose the first and second moments of a stochastic process
:math:`{\bf Y}_T` don't depend on :math:`\smash{t \in T}`:

.. math::

   \begin{align*}
   E[Y_t] & = \mu \,\,\,\, \forall t \\
   Cov(Y_t, Y_{t-j}) & = \gamma_j \,\,\,\, \forall t \text{ and any } j.
   \end{align*}

- In this case :math:`{\bf Y}_T` is *weakly stationary* or *covariance
  stationary*.

- In the previous example, if :math:`\smash{Y_t = \mu + \varepsilon_t}`
  and :math:`\smash{\sigma^2_t = \sigma^2}` :math:`\smash{\forall t}`,
  :math:`{\bf Y}_T` is weakly stationary.

- However, if :math:`\smash{\mu_t}` is not constant across :math:`t`,
  :math:`{\bf Y}_T` is *not* weakly stationary.

Autocorrelation under Weak Stationarity
==============================================================================

If :math:`{\bf Y}_T` is weakly stationary,

.. math::

   \begin{align*}
   \rho_{jt} & = \frac{\gamma_{jt}}{\sqrt{\gamma_{0t}}
   \sqrt{\gamma_{0,t-j}}} \\
   & = \frac{\gamma_j}{\sqrt{\gamma_0} \sqrt{\gamma_0}} \\
   & = \frac{\gamma_j}{\gamma_0} \\
   & = \rho_j.
   \end{align*}

- Note that :math:`\smash{\rho_0 = 1}`.

Weak Stationarity
==============================================================================

Under weak stationarity, autocovariances :math:`\smash{\gamma_j}` only
depend on the distance between random variables within a stochastic
process:

.. math::

   \begin{align*}
   Cov(Y_{\tau}, Y_{\tau-j}) = Cov(Y_t, Y_{t-j}) = \gamma_j.
   \end{align*}

This implies

.. math::

   \begin{align*}
   \gamma_{-j} = Cov(Y_{t+j}, Y_t) = Cov(Y_t, Y_{t-j}) = \gamma_j.
   \end{align*}

Weak Stationarity
==============================================================================

More generally,

.. math::

   \begin{align*}
   \Sigma_{{\bf Y}_T} & = \left[\begin{array}{ccccc}
   \gamma_0 & \gamma_1 & \cdots & \gamma_{T-2} & \gamma_{T-1} \\
   \gamma_1 & \gamma_0 & \cdots & \gamma_{T-3} & \gamma_{T-2} \\
   \vdots & \vdots & \ddots & \vdots & \vdots \\
   \gamma_{T-2} & \gamma_{T-3} & \cdots & \gamma_0 & \gamma_1 \\
   \gamma_{T-1} & \gamma_{T-2} & \cdots & \gamma_1 & \gamma_0
   \end{array}\right].
   \end{align*}
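
Example: The Autocovariance Matrix in Code
==============================================================================

The covariance matrix on the previous slide is symmetric Toeplitz: the
:math:`(s, t)` entry is :math:`\gamma_{|s-t|}`. A small sketch (assuming
NumPy and SciPy; the autocovariance values are hypothetical):

.. code-block:: python

   import numpy as np
   from scipy.linalg import toeplitz

   # Hypothetical autocovariances of a weakly stationary process with
   # T = 5: gamma_j for j = 0, 1, ..., 4, declining with the lag j.
   gamma = np.array([2.0, 1.2, 0.7, 0.4, 0.2])

   # Under weak stationarity, Sigma_{Y_T} is a symmetric Toeplitz
   # matrix whose (s, t) entry is gamma_{|s - t|}.
   Sigma = toeplitz(gamma)
   print(Sigma)

   # Autocorrelations rho_j = gamma_j / gamma_0, with rho_0 = 1.
   rho = gamma / gamma[0]
   print(rho)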
Strict Stationarity
==============================================================================

:math:`{\bf Y}_T` is *strictly stationary* if for any set
:math:`\smash{\{j_1, j_2, \ldots, j_n\} \subset T}`

.. math::

   \begin{align*}
   f_{Y_{j_1},\ldots,Y_{j_n}}(a_1, \ldots, a_n) =
   f_{Y_{j_1 + \tau},\ldots,Y_{j_n + \tau}}(a_1, \ldots, a_n),
   \,\,\, \forall \tau.
   \end{align*}

- Strict stationarity means that the joint distribution of any subset of
  random variables in :math:`{\bf Y}_T` is invariant to shifts in time,
  :math:`\smash{\tau}`.

- Strict stationarity :math:`\smash{\Rightarrow}` weak stationarity if the
  first and second moments of the process exist.

- Weak stationarity :math:`\smash{\not\Rightarrow}` strict stationarity:
  invariance of first and second moments to time shifts (weak stationarity)
  does not mean that all higher moments are invariant to time shifts
  (strict stationarity).

Strict Stationarity
==============================================================================

If :math:`{\bf Y}_T` is Gaussian, then weak stationarity
:math:`\smash{\Rightarrow}` strict stationarity.

- If :math:`{\bf Y}_T` is Gaussian, all marginal distributions of
  :math:`\smash{(Y_{j_1}, \ldots, Y_{j_n})}` are also Gaussian.

- Gaussian distributions are fully characterized by their first and second
  moments.

Ergodicity
==============================================================================

Given :math:`\smash{N}` independent, identically distributed copies of a
weakly stationary stochastic process,
:math:`\smash{\left\{{\bf Y}_T^{(i)}\right\}_{i=1}^N}`, the *ensemble
average* converges to the mean by the law of large numbers:

.. math::

   \begin{align*}
   \frac{1}{N} \sum_{i=1}^N Y_t^{(i)} \stackrel{p}{\to} \mu, \,\,\,\,
   \forall t.
   \end{align*}

For a single stochastic process, we desire conditions under which the
*time average* also converges:

.. math::

   \begin{align*}
   \frac{1}{T} \sum_{t=1}^T Y_t \stackrel{p}{\to} \mu.
   \end{align*}

Ergodicity
==============================================================================

If :math:`{\bf Y}_T` is weakly stationary and

.. math::

   \begin{align*}
   \sum_{j=0}^{\infty} |\gamma_j| < \infty,
   \end{align*}

then :math:`{\bf Y}_T` is *ergodic for the mean* and the time average
converges.

- The condition above requires that the autocovariances fall to zero
  sufficiently quickly.

- That is, a *long realization* of :math:`\smash{\{y_t\}}` will have many
  segments that are uncorrelated and which can be used to approximate an
  ensemble average.

Ergodicity
==============================================================================

A weakly stationary process is ergodic for the second moments if

.. math::

   \begin{align*}
   \frac{1}{T-j} \sum_{t=j+1}^T (Y_t - \mu)(Y_{t-j} - \mu)
   \stackrel{p}{\to} \gamma_j.
   \end{align*}

- Separate conditions exist which ensure that this convergence holds.

- If :math:`{\bf Y}_T` is Gaussian and stationary, then
  :math:`\smash{\sum_{j=0}^{\infty} |\gamma_j| < \infty}` ensures that
  :math:`\smash{{\bf Y}_T}` is ergodic for all moments.
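
Simulation: Ergodic vs. Non-Ergodic
==============================================================================

To see what failure of ergodicity looks like, consider an illustrative
process with a level :math:`Z` drawn once per realization: it is weakly
stationary with :math:`\smash{\gamma_j = Var(Z)}` for all
:math:`\smash{j \geq 1}`, so :math:`\smash{\sum_j |\gamma_j|}` diverges
and the time average converges to :math:`\mu + Z` rather than
:math:`\mu`. A minimal sketch (assuming NumPy; the processes and
parameters are hypothetical):

.. code-block:: python

   import numpy as np

   rng = np.random.default_rng(2)
   T = 100_000
   mu, sigma, tau = 1.0, 1.0, 3.0

   # Ergodic case: Gaussian white noise with drift.  gamma_j = 0 for
   # j >= 1, so sum |gamma_j| < infinity and the time average
   # converges to mu.
   y_ergodic = mu + sigma * rng.standard_normal(T)
   print(y_ergodic.mean())             # approx mu = 1.0

   # Non-ergodic case: add a level Z drawn ONCE per realization.
   # Then gamma_j = tau^2 for every j >= 1, the autocovariances never
   # die out, and the time average converges to mu + Z instead of mu.
   Z = tau * rng.standard_normal()
   y_nonergodic = mu + Z + sigma * rng.standard_normal(T)
   print(y_nonergodic.mean(), mu + Z)  # time average tracks mu + Z

However long the realization, the time average of the second process
carries its realization-specific level :math:`Z`; only an ensemble average
across realizations recovers :math:`\mu`.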