
Statistics (II)


Joint Distributions

Definition

The joint distribution function $F(x, y)$ of two random variables $X, Y$ is defined as $F(x, y) = P(X \leq x, Y \leq y)$.

  • The cdf gives the probability that $(X, Y)$ falls in a rectangular region.

    As illustrated in Figure 3.2, the probability that $(X, Y)$ falls in the rectangle $(a, b] \times (c, d]$ is $P(a < X \leq b, c < Y \leq d) = F(b, d) - F(a, d) - F(b, c) + F(a, c)$; a numerical check follows below.
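A minimal numerical sketch of this inclusion-exclusion identity, assuming a bivariate normal joint distribution and SciPy's `multivariate_normal` for the joint cdf (both are illustrative choices, not part of the notes):

```python
import numpy as np
from scipy.stats import multivariate_normal

# Illustrative joint distribution: bivariate normal with correlation 0.5.
rv = multivariate_normal(mean=[0.0, 0.0], cov=[[1.0, 0.5], [0.5, 1.0]])
F = lambda x, y: rv.cdf([x, y])  # joint cdf F(x, y) = P(X <= x, Y <= y)

a, b, c, d = -1.0, 1.0, -0.5, 2.0
# Inclusion-exclusion over the rectangle (a, b] x (c, d]
p_rect = F(b, d) - F(a, d) - F(b, c) + F(a, c)
print(round(p_rect, 4))
```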

Discrete

  • The joint probability mass function $p(x, y)$ is defined as $p(x, y) = P(X = x, Y = y)$
  • Marginal probability mass functions: $p_X(x) = \sum_{y} p(x, y)$, $p_Y(y) = \sum_{x} p(x, y)$ (see the sketch below)
    • $p_X(x)$ gives the pmf of $X$
    • $p_Y(y)$ gives the pmf of $Y$
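As a small illustration (the joint table is made up for this sketch), the marginals are just row and column sums of the joint pmf:

```python
import numpy as np

# Hypothetical joint pmf of (X, Y); rows index values of X, columns values of Y.
p_xy = np.array([[0.10, 0.20, 0.05],
                 [0.15, 0.30, 0.20]])
assert np.isclose(p_xy.sum(), 1.0)

p_x = p_xy.sum(axis=1)  # p_X(x) = sum over y of p(x, y)
p_y = p_xy.sum(axis=0)  # p_Y(y) = sum over x of p(x, y)
print(p_x, p_y)
```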

Continuous

  • The joint probability density function $f(x, y)$ is defined as $\displaystyle f(x, y) = \frac{\partial^2 F(x, y)}{\partial x \partial y}$

Sums and Quotients

Suppose that $X, Y$ are two random variables with joint pmf/pdf $f(x, y)$. Then the pmf (discrete case) and cdf (continuous case) of $Z = X + Y$ are given by:

  • pmf
    $p_Z(z) = P(Z = z) = P(X + Y = z) = \sum_{x} P(X = x, Y = z - x) = \sum_{x} f(x, z - x)$
  • cdf
    If $X, Y$ are independent, then:
    $F_Z(z) = P(Z \leq z) = P(X + Y \leq z) = \int_{-\infty}^{\infty} \int_{-\infty}^{z - x} f(x, y)\, dy\, dx = \int_{-\infty}^{\infty} f_X(x) F_Y(z - x)\, dx$
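The discrete sum formula is a convolution of the two pmfs; a quick sketch with two fair dice (an assumed example, not from the notes):

```python
import numpy as np

# p_Z(z) = sum_x p_X(x) p_Y(z - x): the pmf of a sum is a discrete convolution.
p_die = np.full(6, 1 / 6)           # pmf of one fair die on {1, ..., 6}
p_sum = np.convolve(p_die, p_die)   # pmf of X + Y on {2, ..., 12}
for z, p in enumerate(p_sum, start=2):
    print(z, round(p, 4))
```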

Quotient of two continuous random variables
Let $X, Y$ be two continuous random variables with joint density $f(x, y)$. Then the cdf and pdf of $Z = X / Y$ are given by:

  • cdf
    $\displaystyle F_Z(z) = P(Z \leq z) = P(X / Y \leq z) = \iint_{A} f(x, y)\, dx\, dy$, where $A = \{(x, y) : x / y \leq z\}$
  • pdf
    $\displaystyle f_Z(z) = \int_{-\infty}^{\infty} |y|\, f(zy, y)\, dy$

e.g.
If $X \sim N(0, 1)$, $Y \sim N(0, 1)$, and $X, Y$ are independent, then $Z = X / Y$ has the Cauchy distribution:

$\displaystyle f_Z(z) = \frac{1}{\pi(1 + z^2)}$

$$\begin{aligned} f_Z(z) &= \int_{-\infty}^{\infty} |x| \frac{1}{2\pi} e^{-\frac{x^2}{2}} e^{-\frac{(xz)^2}{2}}\, dx \\ &= \int_{-\infty}^{\infty} \frac{|x|}{2\pi} e^{-\frac{x^2}{2}(1+z^2)}\, dx \\ &= \int_{0}^{\infty} \frac{x}{\pi} e^{-\frac{x^2}{2}(1+z^2)}\, dx \\ &= \frac{1}{2\pi} \int_{0}^{\infty} e^{-u\,\frac{z^{2}+1}{2}}\, du \qquad (u = x^2,\ du = 2x\, dx) \\ &= \frac{1}{2\pi}\cdot\frac{2}{z^{2}+1} \\ &= \frac{1}{\pi(z^{2}+1)} \end{aligned}$$
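A quick simulation check (an assumed illustration using NumPy): the empirical cdf of the ratio of two independent standard normals should match the standard Cauchy cdf $\frac{1}{2} + \frac{\arctan z}{\pi}$ implied by the density above.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(200_000)
y = rng.standard_normal(200_000)
z = x / y                                            # Z = X / Y

cauchy_cdf = lambda t: 0.5 + np.arctan(t) / np.pi    # cdf matching f_Z(z) = 1/(pi(1+z^2))
for t in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(t, round(float(np.mean(z <= t)), 3), round(float(cauchy_cdf(t)), 3))
```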

Order Statistics

  • Order statistics: $X_{(1)} \leq X_{(2)} \leq \dots \leq X_{(n)}$
  • $X_{(1)}$ is called the minimum order statistic and $X_{(n)}$ the maximum order statistic
  • $X_{(k)}$ is called the $k$-th order statistic
  • The pdf of $X_{(k)}$ (for an i.i.d. sample with pdf $f$ and cdf $F$): $\displaystyle f_{X_{(k)}}(x) = \frac{n!}{(k-1)!(n-k)!} f(x) F(x)^{k-1} [1 - F(x)]^{n-k}$
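A small sanity check of this density, assuming a standard normal sample and SciPy's `norm` for $f$ and $F$ (illustrative choices): the mean of the simulated second-smallest value should match the mean computed from $f_{X_{(k)}}$.

```python
import numpy as np
from math import factorial
from scipy.stats import norm

n, k = 5, 2
f, F = norm.pdf, norm.cdf

def f_order(x):
    # f_{X_(k)}(x) = n!/((k-1)!(n-k)!) * f(x) * F(x)^(k-1) * (1 - F(x))^(n-k)
    c = factorial(n) / (factorial(k - 1) * factorial(n - k))
    return c * f(x) * F(x) ** (k - 1) * (1 - F(x)) ** (n - k)

rng = np.random.default_rng(1)
samples = np.sort(rng.standard_normal((100_000, n)), axis=1)[:, k - 1]
xs = np.linspace(-6, 6, 2001)
dx = xs[1] - xs[0]
print(round(float(samples.mean()), 3))                   # simulated E[X_(2)]
print(round(float(np.sum(xs * f_order(xs)) * dx), 3))    # E[X_(2)] from the density
```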

Expectation

Expected value of a random variable

Univariate

  • Expected value of a discrete random variable $X$: $\displaystyle E(X) = \sum_{x} x\, p(x)$
  • Expected value of a continuous random variable $X$: $\displaystyle E(X) = \int_{-\infty}^{\infty} x f(x)\, dx$

Multivariate

  • Expectation of the product of discrete random variables $X_1, X_2, \dots, X_n$:

    $\displaystyle E(X_1 X_2 \cdots X_n) = \sum_{x_1} \sum_{x_2} \cdots \sum_{x_n} x_1 x_2 \cdots x_n\, p(x_1, x_2, \dots, x_n)$

  • Expectation of the product of continuous random variables $X_1, X_2, \dots, X_n$:

    $\displaystyle E(X_1 X_2 \cdots X_n) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} x_1 x_2 \cdots x_n\, f(x_1, x_2, \dots, x_n)\, dx_1 dx_2 \cdots dx_n$

Variance and Standard Deviation

Definition

  • Variance of a random variable $X$: $\displaystyle Var(X) = E[(X - E(X))^2] = E(X^2) - [E(X)]^2$

    With $\mu = E(X)$: $Var(X) = E((X - \mu)^2) = E(X^2 + \mu^2 - 2X\mu) = E(X^2) + \mu^2 - 2\mu E(X) = E(X^2) - E(X)^2$

  • Standard deviation of $X$: $\displaystyle SD(X) = \sqrt{Var(X)}$

Continuous & Discrete

  • Variance of a discrete random variable $X$: $\displaystyle Var(X) = \sum_{x} (x - E(X))^2 p(x)$
  • Variance of a continuous random variable $X$: $\displaystyle Var(X) = \int_{-\infty}^{\infty} (x - E(X))^2 f(x)\, dx$

Chebyshev's Inequality

  • For any random variable $X$ and any real number $k > 0$: $$\displaystyle P(|X - E(X)| \geq k) \leq \frac{Var(X)}{k^2}$$
  • This follows from Markov's inequality (a simulation check appears below):
    $\displaystyle P(|X - E(X)| \geq k) = P((X - E(X))^2 \geq k^2) \leq \frac{E[(X - E(X))^2]}{k^2} = \frac{Var(X)}{k^2}$
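A quick empirical comparison of the tail probability with the Chebyshev bound, assuming an Exponential(1) sample as an illustrative distribution ($E(X) = Var(X) = 1$):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.exponential(scale=1.0, size=1_000_000)       # E(X) = 1, Var(X) = 1
for k in (1.0, 2.0, 3.0):
    tail = np.mean(np.abs(x - 1.0) >= k)              # empirical P(|X - E(X)| >= k)
    print(k, round(float(tail), 4), round(1.0 / k**2, 4))  # tail stays below Var(X)/k^2
```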

Covariance and Correlation

Definition

  • Covariance of random variables $X, Y$:

    $\displaystyle Cov(X, Y) = E[(X - E(X))(Y - E(Y))] = E(XY) - E(X)E(Y)$

    $\displaystyle Cov(X, Y) = E(XY - XE(Y) - YE(X) + E(X)E(Y)) = E(XY) - E(X)E(Y)$

    • If $X, Y$ are independent, then $Cov(X, Y) = E(XY) - E(X)E(Y) = E(X)E(Y) - E(X)E(Y) = 0$ (the converse does not hold)
    • Properties of covariance (checked numerically in the sketch after this list)
      • $Cov(a+X, Y) = Cov(X, Y)$ where $a$ is a constant

        $Cov(a+X, Y) = E((a+X)Y) - E(a+X)E(Y) = aE(Y) + E(XY) - aE(Y) - E(X)E(Y) = Cov(X, Y)$
        Hence adding or subtracting a constant does not change the covariance

      • $Cov(aX, bY) = ab\, Cov(X, Y)$ where $a, b$ are constants

        $Cov(aX, bY) = E(aXbY) - E(aX)E(bY) = abE(XY) - abE(X)E(Y) = ab\, Cov(X, Y)$

      • $Cov(X, Y+Z) = Cov(X, Y) + Cov(X, Z)$

        $Cov(X, Y+Z) = E(X(Y+Z)) - E(X)E(Y+Z) = E(XY) + E(XZ) - E(X)E(Y) - E(X)E(Z) = Cov(X, Y) + Cov(X, Z)$

      • $Cov(aW+bX, cY+dZ) = ac\, Cov(W, Y) + ad\, Cov(W, Z) + bc\, Cov(X, Y) + bd\, Cov(X, Z)$
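A minimal numerical sketch of the bilinearity property, assuming i.i.d. standard normal samples and sample covariances (which satisfy the same identity):

```python
import numpy as np

rng = np.random.default_rng(3)
w, x, y, z = rng.standard_normal((4, 100_000))
a, b, c, d = 2.0, -1.0, 0.5, 3.0

cov = lambda u, v: np.cov(u, v)[0, 1]
lhs = cov(a * w + b * x, c * y + d * z)
rhs = (a * c * cov(w, y) + a * d * cov(w, z)
       + b * c * cov(x, y) + b * d * cov(x, z))
print(round(lhs, 6), round(rhs, 6))   # identical up to floating-point rounding
```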

Correlation Coefficient

If $X, Y$ are jointly distributed random variables whose covariance and variances exist, the correlation coefficient of $X, Y$ is defined as:

$\displaystyle \rho(X, Y) = \frac{Cov(X, Y)}{\sqrt{Var(X)Var(Y)}}$

  • By the Cauchy-Schwarz inequality, $|\rho(X, Y)| \leq 1$:

    $\displaystyle |Cov(X, Y)| \leq \sqrt{Var(X)Var(Y)}$

    Proof: let $\displaystyle Z = X - \frac{Cov(X, Y)}{Var(Y)}Y$. Then
    $$\begin{aligned} 0 \leq Var(Z) &= Cov(Z, Z) \\ &= Cov\!\left(X - \frac{Cov(X, Y)}{Var(Y)}Y,\; X - \frac{Cov(X, Y)}{Var(Y)}Y\right) \\ &= Var(X) - \frac{Cov(X, Y)^2}{Var(Y)} \end{aligned}$$
    so $Cov(X, Y)^2 \leq Var(X)Var(Y)$, hence $-\sqrt{Var(X)Var(Y)} \leq Cov(X, Y) \leq \sqrt{Var(X)Var(Y)}$ and $\displaystyle -1 \leq \frac{Cov(X, Y)}{\sqrt{Var(X)Var(Y)}} \leq 1$.
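A short check of the definition, assuming $Y = 2X + \varepsilon$ with standard normal $X$ and noise $\varepsilon$ (an illustrative model, giving $\rho = 2/\sqrt{5} \approx 0.894$):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.standard_normal(50_000)
y = 2 * x + rng.standard_normal(50_000)      # Cov(X, Y) = 2, Var(Y) = 5

rho = np.cov(x, y)[0, 1] / np.sqrt(np.var(x, ddof=1) * np.var(y, ddof=1))
print(round(rho, 3), round(np.corrcoef(x, y)[0, 1], 3))   # both about 0.894, |rho| <= 1
```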

Conditional Expectation

Conditional expectation of a random variable

  • Discrete: $\displaystyle E(Y|X=x) = \sum_{y} y\, P(Y=y|X=x) = \sum_{y} y\, p_{Y|X}(y|x)$

  • Continuous: $\displaystyle E(Y|X=x) = \int_{-\infty}^{\infty} y\, f_{Y|X}(y|x)\, dy$

    More generally, $\displaystyle E(h(Y)|X=x) = \int_{-\infty}^{\infty} h(y)\, f_{Y|X}(y|x)\, dy$

Thm: Law of Total Expectation

Statement

Let $X, Y$ be two random variables. Then $E(Y) = E(E(Y|X))$.

Proof:
$$\begin{aligned} E(E(Y|X)) &= \int_{-\infty}^{\infty} E(Y|X=x)\, f_X(x)\, dx \\ &= \int_{-\infty}^{\infty} \left[\int_{-\infty}^{\infty} y\, f_{Y|X}(y|x)\, dy\right] f_X(x)\, dx \\ &= \int_{-\infty}^{\infty} y \int_{-\infty}^{\infty} f(x, y)\, dx\, dy \\ &= \int_{-\infty}^{\infty} y\, f_Y(y)\, dy \\ &= E(Y) \end{aligned}$$
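A simulation sketch of the law of total expectation, assuming the hierarchical model $X \sim \mathrm{Uniform}(0,1)$ and $Y \mid X = x \sim \mathrm{Poisson}(10x)$ (illustrative only), so $E(Y|X) = 10X$ and $E(Y) = 5$:

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.uniform(0, 1, 500_000)
y = rng.poisson(10 * x)                  # Y | X = x ~ Poisson(10 x)

print(round(float(y.mean()), 3))         # direct estimate of E(Y)
print(round(float((10 * x).mean()), 3))  # estimate of E(E(Y | X)) = E(10 X) = 5
```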

Thm: Law of Total Variance

Statement

Let $X, Y$ be two random variables. Then $Var(Y) = E(Var(Y|X)) + Var(E(Y|X))$.
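Continuing the same assumed Poisson model, $Var(Y|X) = 10X$, so $E(Var(Y|X)) = 5$ and $Var(E(Y|X)) = Var(10X) = 100/12 \approx 8.33$; the two terms should add up to $Var(Y)$:

```python
import numpy as np

rng = np.random.default_rng(6)
x = rng.uniform(0, 1, 500_000)
y = rng.poisson(10 * x)

print(round(float(y.var()), 3))                            # Var(Y), about 5 + 100/12 = 13.33
print(round(float((10 * x).mean() + (10 * x).var()), 3))   # E(Var(Y|X)) + Var(E(Y|X))
```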

The moment generating function

Definition

  • The moment generating function of a random variable $X$: $\displaystyle M_X(t) = E(e^{tX})$
  • Discrete: $\displaystyle M_X(t) = \sum_{x} e^{tx} p(x)$
  • Continuous: $\displaystyle M_X(t) = \int_{-\infty}^{\infty} e^{tx} f(x)\, dx$

Properties

  • If the mgf exists for all $t$ in an open interval containing $0$, it uniquely determines the distribution of the random variable
  • If the moments exist, the $r$-th moment is $\displaystyle E(X^r) = M_X^{(r)}(0)$ (illustrated in the sketch below)
  • If $X$ has an mgf and $Y = a + bX$, then $\displaystyle M_Y(t) = E(e^{t(a+bX)}) = e^{at} E(e^{(bt)X}) = e^{at} M_X(bt)$
  • If $X, Y$ are independent, then $\displaystyle M_{X+Y}(t) = M_X(t)M_Y(t)$

Limit Theorems

Law of Large Numbers

nn \rightarrow \infty 時,樣本平均值 Xˉ=1ni=1nXi\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i 將收斂到期望值 E(X)E(X)

Weak Law of Large Numbers

  • If $X_1, X_2, \dots, X_n$ are independent and identically distributed random variables with $E(X_i) = \mu$ and $Var(X_i) = \sigma^2$ for all $i$,
    let $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$.
    Then for any $\epsilon > 0$: $\displaystyle P(|\bar{X} - \mu| \geq \epsilon) \rightarrow 0$ as $n \rightarrow \infty$

    We say that the sample mean $\bar{X}$ converges in probability to $\mu$, written $\bar{X} \xrightarrow{p} \mu$ as $n \rightarrow \infty$

Proof (a simulation sketch follows below):

  1. $E(\bar{X}) = E(\frac{1}{n} \sum_{i=1}^{n} X_i) = \frac{1}{n} \sum_{i=1}^{n} E(X_i) = \mu$
  2. $Var(\bar{X}) = Var(\frac{1}{n} \sum_{i=1}^{n} X_i) = \frac{1}{n^2} \sum_{i=1}^{n} Var(X_i) = \frac{\sigma^2}{n}$ (by independence)
  3. By Chebyshev's Inequality:
    $\displaystyle P(|\bar{X} - \mu| \geq \epsilon) \leq \frac{\sigma^2}{n\epsilon^2} \rightarrow 0$ as $n \rightarrow \infty$. Hence $\bar{X} \xrightarrow{p} \mu$ as $n \rightarrow \infty$
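A small simulation of the weak law, assuming Bernoulli(0.3) draws as an illustrative distribution: the empirical probability $P(|\bar{X} - \mu| \geq \epsilon)$ shrinks with $n$ and stays below the Chebyshev bound $\sigma^2/(n\epsilon^2)$.

```python
import numpy as np

rng = np.random.default_rng(7)
mu, sigma2, eps = 0.3, 0.3 * 0.7, 0.05
for n in (100, 1_000, 10_000):
    xbar = rng.binomial(1, mu, size=(1_000, n)).mean(axis=1)   # 1000 sample means of size n
    tail = np.mean(np.abs(xbar - mu) >= eps)
    print(n, round(float(tail), 3), round(sigma2 / (n * eps**2), 3))  # empirical vs bound
```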

Continuous Mapping Theorem

Let $Y_n$ be a sequence of random variables indexed by $n$. If $Y_n \xrightarrow{p} Y$ and $g(x)$ is a continuous function, then $g(Y_n) \xrightarrow{p} g(Y)$.

Central Limit Theorem

  • If $X_1, X_2, \dots, X_n$ are independent and identically distributed random variables with $E(X_i) = \mu$, $Var(X_i) = \sigma^2$, common cdf $F$, and mgf $M$,

    let $\displaystyle S_n = \sum_{i=1}^{n} X_i$. Then $\displaystyle \lim_{n \rightarrow \infty} P\!\left(\frac{S_n - n\mu}{\sigma \sqrt{n}} \leq x\right) = \Phi(x)$, $-\infty < x < \infty$

    In other words, $\displaystyle \frac{S_n - n\mu}{\sigma \sqrt{n}} \xrightarrow{d} Z$ as $n \rightarrow \infty$, where $Z \sim N(0, 1)$

    Equivalently, with $\displaystyle \bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$:
    $\displaystyle \frac{S_n - n\mu}{\sigma \sqrt{n}} = \frac{\sqrt{n}\,(\bar{X}_n - \mu)}{\sigma} \xrightarrow{d} Z$

  • Proof:
    Let $X_1, X_2, \dots, X_n$ be independent and identically distributed random variables with $E(X_i) = \mu$ and $Var(X_i) = \sigma^2$.

    • Assume the common mgf $M_X(t)$ exists and is defined in a neighborhood of $0$
    1. Without loss of generality, assume $\mu = 0$
      Let $\displaystyle Z_n = \frac{\sqrt{n}}{\sigma}\bar{X}_n = \frac{1}{\sigma \sqrt{n}}(X_1 + X_2 + \dots + X_n)$
      By independence, $\displaystyle M_{Z_n}(t) = M_{\frac{1}{\sigma \sqrt{n}}(X_1 + X_2 + \dots + X_n)}(t) = M_X\!\left(\frac{t}{\sigma \sqrt{n}}\right)^n$

    2. Consider the Taylor expansion of $M_X(t)$ around $t = 0$:
      $\displaystyle M_X(t) = M_X(0) + M_X'(0)\,t + \frac{1}{2}M_X''(0)\,t^2 + \epsilon(t)$, where the remainder $\epsilon(t)$ satisfies $\displaystyle \frac{\epsilon(t)}{t^2} \rightarrow 0$ as $t \rightarrow 0$

      Here $M_X(0) = 1$, $M_X'(0) = E(X) = 0$, and $M_X''(0) = E(X^2) = \sigma^2$

    3. Thus $\displaystyle M_{X}\!\left(\frac{t}{\sigma \sqrt{n}}\right) = 1 + \frac{1}{2}\left(\frac{t}{\sigma \sqrt{n}}\right)^{2} \sigma^2 + \epsilon\!\left(\frac{t}{\sigma \sqrt{n}}\right) = 1 + \frac{t^2}{2n} + \epsilon_n$, where $n\,\epsilon_n \rightarrow 0$ because $\epsilon(t)/t^2 \rightarrow 0$
      $\displaystyle M_{Z_n}(t) = M_X\!\left(\frac{t}{\sigma \sqrt{n}}\right)^n = \left(1 + \frac{t^2}{2n} + \epsilon_n\right)^n \rightarrow e^{\frac{t^2}{2}}$ as $n \rightarrow \infty$

    4. The mgf of $Z_n$ converges to $e^{\frac{t^2}{2}}$, which is the mgf of $N(0, 1)$

      $\displaystyle Z_n \xrightarrow{d} N(0, 1)$ as $n \rightarrow \infty$

      $\Rightarrow \displaystyle \frac{S_n}{\sigma \sqrt{n}} = \frac{\sqrt{n}\,\bar{X}_n}{\sigma} \xrightarrow{d} N(0, 1)$ as $n \rightarrow \infty$ (recall $\mu = 0$ here); a simulation check follows below
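A quick simulation of the theorem, assuming i.i.d. Exponential(1) summands ($\mu = \sigma = 1$) as an illustrative distribution: the empirical cdf of the standardized sum should be close to $\Phi(x)$.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(8)
n, reps = 50, 100_000
s_n = rng.exponential(1.0, size=(reps, n)).sum(axis=1)
z_n = (s_n - n * 1.0) / (1.0 * np.sqrt(n))          # (S_n - n*mu) / (sigma * sqrt(n))

for x in (-1.0, 0.0, 1.0, 2.0):
    print(x, round(float(np.mean(z_n <= x)), 3), round(float(norm.cdf(x)), 3))  # vs Phi(x)
```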