.. slideconf:: :slide_classes: appear =========== Probability =========== Random Variables ================ Suppose :math:`X` is a random variable which can take values :math:`x \in \mathcal{X}`. .. rst-class:: to-build - :math:`X` is a discrete r.v. if :math:`\mathcal{X}` is countable. .. rst-class:: to-build - :math:`p(x)` is the probability of a value :math:`x` and is called the probability mass function. .. rst-class:: to-build - :math:`X` is a continuous r.v. if :math:`\mathcal{X}` is uncountable. .. rst-class:: to-build - :math:`f(x)` is called the probability density function and can be thought of as the probability of a value :math:`x`. Probability Mass Function ========================= For a discrete random variable the *probability mass function* (PMF) is .. math:: p(a) = P(X = a), where :math:`a \in \mathbb{R}`. Probability Density Function ============================ If :math:`B = (a,b)` .. math:: P(X \in B) = P(a \leq X \leq b) = \int_a^b f(x) dx. .. rst-class:: to-build Strictly speaking .. rst-class:: to-build .. math:: P(X = a) = \int_a^a f(x) dx = 0, .. rst-class:: to-build but we may (intuitively) think of :math:`f(a) = P(X=a)`. Properties of Distributions =========================== For discrete random variables .. rst-class:: to-build - :math:`p(x) \geq 0`, :math:`\forall x \in \mathcal{X}`. .. rst-class:: to-build - :math:`\sum_{x\in \mathcal{X}} p(x) = 1`. .. rst-class:: to-build For continuous random variables .. rst-class:: to-build - :math:`f(x) \geq 0`, :math:`\forall x \in \mathcal{X}`. .. rst-class:: to-build - :math:`\int_{x\in \mathcal{X}} f(x)dx = 1`. Cumulative Distribution Function ================================ For discrete random variables the cumulative distribution function (CDF) is .. rst-class:: to-build - :math:`F(a) = P(X \leq a) = \sum_{x \leq a} p(x).` .. rst-class:: to-build For continuous random variables the CDF is .. rst-class:: to-build - :math:`F(a) = P(X \leq a) = \int_{-\infty}^a f(x) dx.` Expected Value ============== For a discrete r.v. :math:`X`, the expected value is .. rst-class:: to-build .. math:: E[X] = \sum_{x\in \mathcal{X}} x p(x). .. rst-class:: to-build For a continuous r.v. :math:`X`, the expected value is .. rst-class:: to-build .. math:: E[X] = \int_{x\in \mathcal{X}} x \, f(x) dx. Expected Value ============== If :math:`Y = g(X)`, then .. rst-class:: to-build - For discrete r.v. :math:`X` .. rst-class:: to-build .. math:: E[Y] = E[g(X)] = \sum_{x \in \mathcal{X}} g(x)p(x). .. rst-class:: to-build .. rst-class:: to-build - For continuous r.v. :math:`X` .. rst-class:: to-build .. math:: E[Y] = E[g(X)] = \int_{x \in \mathcal{X}} g(x)f(x)dx. Properties of Expectation ========================= For random variables :math:`X` and :math:`Y` and constants :math:`a,b \in \mathbb{R}`, the expected value has the following properties (for both discrete and continuous r.v.'s): .. rst-class:: to-build - :math:`E[aX + b] = aE[X] + b.` .. rst-class:: to-build - :math:`E[X + Y] = E[X] + E[Y].` .. rst-class:: to-build Realizations of :math:`X`, denoted by :math:`x`, may be larger or smaller than :math:`E[X]`. .. rst-class:: to-build - If you observed many realizations of :math:`X`, :math:`E[X]` is roughly an average of the values you would observe. Properties of Expectation - Proof ================================= .. math:: E[aX + b] & = \int_{-\infty}^{\infty} (ax+b)f(x) dx \\ & = \int_{-\infty}^{\infty} a x f(x) dx + \int_{-\infty}^{\infty} b f(x) dx \\ & = a \int_{-\infty}^{\infty} x f(x) dx + b \int_{-\infty}^{\infty} f(x) dx \\ & = a\,E[X] + b. Variance ======== Generally speaking, variance is defined as .. math:: Var(X) = E\left[(X - E[X])^2\right]. .. rst-class:: to-build If :math:`X` is discrete: .. rst-class:: to-build .. math:: Var(X) = \sum_{x\in \mathcal{X}} (x - E[X])^2 p(x). .. rst-class:: to-build If :math:`X` is continuous: .. rst-class:: to-build .. math:: Var(X) & = \int_{x\in \mathcal{X}} (x - E[X])^2 f(x) dx Variance ======== Using the properties of expectations, we can show :math:`Var(X) = E[X^2] - E[X]^2`: .. rst-class:: to-build .. math:: Var(X) & = E\left[(X - E[X])^2\right] \\ & = E\left[X^2 - 2XE[X] + E[X]^2\right] \\ & = E[X^2] - 2E[X]E[X] + E[X]^2 \\ & = E[X^2] - E[X]^2. Standard Deviation ================== The standard deviation is simply .. math:: Std(X) = \sqrt{Var(X)}. .. rst-class:: to-build .. rst-class:: to-build - :math:`Std(X)` is in the same units as :math:`X`. .. rst-class:: to-build - :math:`Var(X)` is in units squared. Covariance ========== For two random variables :math:`X` and :math:`Y`, the covariance is generally defined as .. math:: Cov(X,Y) = E\left[(X-E[X])(Y-E[Y])\right] .. rst-class:: to-build Note that :math:`Cov(X,X) = Var(X)`. Covariance ========== Using the properties of expectations, we can show .. math:: Cov(X,Y) = E[XY] - E[X]E[Y]. .. rst-class:: to-build This can be proven in the exact way that we proved .. rst-class:: to-build .. math:: Var(X) = E[X^2] - E[X]^2. .. rst-class:: to-build In fact, note that .. rst-class:: to-build .. math:: Cov(X,X) & = E[XY] - E[X]E[Y] \\ & = E[X^2] - E[X]^2 = Var(X). Properties of Variance ====================== Given random variables :math:`X` and :math:`Y` and constants :math:`a,b \in \mathbb{R}`, .. rst-class:: to-build .. math:: Var(aX + b) = a^2Var(X). .. rst-class:: to-build .. math:: Var(aX+bY) & = a^2Var(X) + b^2Var(Y) \\ & \hspace{3in} + 2abCov(X,Y). .. rst-class:: to-build The latter property can be generalized to .. rst-class:: to-build .. math:: Var\left(\sum_{i=1}^n a_i X_i \right) & = \sum_{i=1}^n a_i^2Var(X_i) \\ & \hspace{1in} + 2 \sum_{i=1}^{n-1} \sum_{j=i+1}^n a_i a_j Cov(X_i, X_j). Properties of Variance - Proof ============================== .. math:: Var&(aX+bY) = E\left[(aX+bY)^2\right] - E\left[aX+bY\right]^2 \\ & = E[a^2X^2 + b^2Y^2 + 2abXY] - \left(aE[X]+bE[Y]\right)^2 \\ & = a^2 E[X^2] + b^2 E[Y^2] + 2abE[XY] \\ & \hspace{1in} - a^2E[X]^2 - b^2E[Y]^2 -2abE[X]E[Y] \\ & = a^2 \left(E[X^2] - E[X]^2\right) + b^2 \left(E[Y^2] - E[Y]^2\right) \\ & \hspace{1.5in} + 2ab \left(E[XY] - E[X]E[Y]\right) \\ & = a^2Var(X) + b^2Var(Y) + 2abCov(X,Y). Properties of Covariance ======================== Given random variables :math:`W`, :math:`X`, :math:`Y` and :math:`Z` and constants :math:`a,b \in \mathbb{R}`, .. rst-class:: to-build .. math:: Cov(X,a) = 0. .. rst-class:: to-build .. math:: Cov(aX,bY) = abCov(X,Y). .. rst-class:: to-build .. math:: Cov(W+X,Y+Z) & = Cov(W,Y) + Cov(W,Z) \\ & \hspace{1.3in}+ Cov(X,Y) + Cov(X,Z). .. rst-class:: to-build The latter two can be generalized to .. rst-class:: to-build .. math:: Cov\left(\sum_{i=1}^n a_i X_i, \sum_{j=1}^m b_j Y_j\right) & = \sum_{i=1}^n \sum_{j=1}^m a_i b_j Cov(X_i, Y_j). Correlation =========== Correlation is defined as .. math:: Corr(X,Y) = \frac{Cov(X,Y)}{Std(X) Std(Y)}. .. rst-class:: to-build - It is fairly easy to show that :math:`-1 \leq Corr(X,Y) \leq 1`. .. rst-class:: to-build - The properties of correlations of sums of random variables follow from those of covariance and standard deviations above. Normal Distribution =================== The normal distribution is often used to approximate the probability distribution of returns. .. rst-class:: to-build - It is a continuous distribution. .. rst-class:: to-build - It is symmetric. .. rst-class:: to-build - It is fully characterized by :math:`\mu` (mean) and :math:`\sigma` (standard deviation) -- i.e. if you only tell me :math:`\mu` and :math:`\sigma`, I can draw every point in the distribution. Normal Density ============== If :math:`X` is normally distributed with mean :math:`\mu` and standard deviation :math:`\sigma`, we write .. math:: X \sim \mathcal{N}(\mu, \sigma). .. rst-class:: to-build The probability density function is .. rst-class:: to-build .. math:: f(x) = \frac{1}{\sqrt{2\pi} \sigma} \exp\left\{\frac{1}{2\sigma^2}(x - \mu)^2\right\}. Normal Distribution =================== From `Wikipedia `_: .. ifslides:: .. image:: /_static/Normal_Distribution_PDF.png :width: 7.5in .. ifnotslides:: .. image:: /_static/Normal_Distribution_PDF.png :width: 6in Standard Normal Distribution ============================ Suppose :math:`X \sim \mathcal{N}(\mu, \sigma)`. .. rst-class:: to-build Then .. rst-class:: to-build .. math:: Z = \frac{X - \mu}{\sigma} .. rst-class:: to-build is a standard normal random variable: :math:`Z \sim \mathcal{N}(0,1)`. .. rst-class:: to-build - That is, :math:`Z` has zero mean and unit standard deviation. .. rst-class:: to-build We can reverse the process by defining .. rst-class:: to-build .. math:: X = \mu + \sigma Z. Standard Normal Distribution - Proof ==================================== .. math:: E[Z] & = E\left[\frac{X - \mu}{\sigma}\right] \\ & = \frac{1}{\sigma} E[X - \mu] \\ & = \frac{1}{\sigma} (E[X] - \mu) \\ & = \frac{1}{\sigma} (\mu - \mu) \\ & = 0. Standard Normal Distribution - Proof ==================================== .. math:: Var(Z) & = Var\left(\frac{X - \mu}{\sigma}\right) \\ & = Var\left(\frac{X}{\sigma} - \frac{\mu}{\sigma}\right) \\ & = \frac{1}{\sigma^2} Var(X) \\ & = \frac{\sigma^2}{\sigma^2} \\ & = 1. Sum of Normals ============== Suppose :math:`X_i \sim \mathcal{N}(\mu_i, \sigma_i)` for :math:`i = 1,\ldots,n`. .. rst-class:: to-build Then if we denote :math:`W = \sum_{i=1}^n X_i` .. rst-class:: to-build .. math:: W \sim \mathcal{N}\left(\sum_{i=1}^n \mu_i, \sqrt{\sum_{i=1}^n \sigma_i^2 + 2\sum_{i=1}^j \sum_{j=1}^n Cov(X_i, X_j)}\right). .. rst-class:: to-build How does this simplify if :math:`Cov(X_i, X_j) = 0` for :math:`i \neq j`? Sample Mean =========== Suppose we don't know the true probabilities of a distribution, but would like to estimate the mean. .. rst-class:: to-build - Given a sample of observations, :math:`\{x_i\}_{i=1}^n`, of random variable :math:`X`, we can estimate the mean by .. rst-class:: to-build .. math:: \hat{\mu} = \frac{1}{n} \sum_{i=1}^n x_i. .. rst-class:: to-build - This is just a simple arithmetic average, or a probability weighted average with equal probabilities: :math:`\frac{1}{n}`. .. rst-class:: to-build - But the true mean is a weighted average using actual (most likely, unequal) probabilities. How do we reconcile this? Sample Mean (Cont.) =================== Given that the sample :math:`\{x_i\}_{i=1}^n` was drawn from the distribution of :math:`X`, the observed values are inherently weighted by the true probabilities (for large samples). .. rst-class:: to-build - More values in the sample will be drawn from the higher probability regions of the distribution. .. rst-class:: to-build - So weighting all of the values equally will naturally give more weight to the higher probability outcomes. Sample Variance =============== Similarly, the sample variance can be defined as .. math:: \hat{\sigma}^2 & = \frac{1}{n-1} \sum_{i=1}^n (x_i - \hat{\mu})^2. .. rst-class:: to-build Notice that we use :math:`\frac{1}{n-1}` instead of :math:`\frac{1}{n}` for the sample average. .. rst-class:: to-build - This is because a simple average using :math:`\frac{1}{n}` underestimates the variability of the data because it doesn't account for extra error involved in estimating :math:`\hat{\mu}`. Other Sample Moments ==================== Sample standard deviations, covariances and correlations are computed in a similar fashion. .. rst-class:: to-build - Use the definitions above, replacing expectations with simple averages. .. header:: ""