PROBABILITY REVIEW
The most intuitive way to describe probability is as a frequency: the number of times a given element appears in a set, divided by the total number of elements in the set.
CONDITIONAL PROBABILITY
The formula is: $P(A|B)\equiv\frac{P(A\cap B)}{P(B)}$.
It represents the probability that A occurs, given that B has occurred.
If $P(A|B)=P(A)$, then A and B are called independent, i.e. $P(A)P(B)=P(A\cap B)$.
From the relations above, we can derive the law of total probability:
$P(A)=\sum_{i=1}^{n}P(B_{i})P(A|B_{i})$, where $\{B_{1},B_{2},\ldots,B_{n}\}$ is a partition of the sample space (the $B_{i}$ are mutually exclusive and their union covers all outcomes).
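As a quick numeric sketch of the formula (the probabilities below are made-up example values):

```python
# Law of total probability with a made-up two-event partition:
# P(B1) = 0.3, P(B2) = 0.7, P(A|B1) = 0.8, P(A|B2) = 0.1.
p_B = [0.3, 0.7]          # P(B_i): the B_i partition the sample space
p_A_given_B = [0.8, 0.1]  # P(A | B_i)

# P(A) = sum_i P(B_i) * P(A | B_i)
p_A = sum(pb * pa for pb, pa in zip(p_B, p_A_given_B))
print(p_A)  # 0.3*0.8 + 0.7*0.1 = 0.31
```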
DISCRETE DISTRIBUTION
For $X\in\{x_{1},x_{2},\ldots,x_{n}\}$ with probabilities $P=\{p_{1},p_{2},\ldots,p_{n}\}$ satisfying $p_{k}\ge 0$ and $\sum_{k}p_{k}=1$, X is discretely distributed.
CONTINUOUS DISTRIBUTION
X is a continuous random variable. Its pdf is $f(x)$ and it satisfies:
$\begin{cases}f(x)\geq0,\forall x\\ \int_{-\infty}^{\infty}f(x)dx=1\\P(a\leq x\leq b)=\int_{a}^{b}f(x)dx\end{cases}$
The cdf of X is $F(x)$, defined as: $F(x)\equiv P(-\infty\le X\le x)=\int_{-\infty}^{x}f(t)dt$.
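These conditions can be sanity-checked numerically. Below, an exponential pdf $f(x)=\lambda e^{-\lambda x}$ is an assumed example, and the integrator is a plain midpoint Riemann sum:

```python
import math

# Assumed example pdf: exponential with rate lam, zero for x < 0.
lam = 2.0
def f(x):
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

def integrate(g, a, b, n=100_000):
    """Midpoint Riemann sum of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

total = integrate(f, 0.0, 50.0)  # ~ integral of f over the real line = 1
F1 = integrate(f, 0.0, 1.0)      # F(1) = int_{-inf}^{1} f(t) dt
print(total, F1)                 # F(1) should be close to 1 - exp(-2)
```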
MULTIVARIATE DISTRIBUTION
Using two dimensions as an example: the joint pdf $f(x,y)$ should satisfy:
$\begin{cases}f(x,y)\geq0,\forall x,y\\ \int\limits_{-\infty}^{\infty}\int\limits_{-\infty}^{\infty}f(x,y)dxdy=1\\ P((X,Y)\in D)=\int\limits_{D}\int f(x,y)dxdy\end{cases}$
There is a marginal pdf of X (and similarly of Y): $f_{X}(x)=\int_{-\infty}^{\infty}f(x,y)dy$
and a cdf $\begin{aligned}F(x,y) & =P(-\infty\leq X\leq x,-\infty\leq Y\leq y)\\ & =\int_{-\infty}^{x}\int_{-\infty}^{y}f(s,t)\,dt\,ds\end{aligned}$.
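The marginal-pdf integral can be sketched numerically. The joint pdf below is an assumed separable example, so its true marginal is known in closed form:

```python
import math

# Assumed joint pdf: f(x, y) = exp(-x - y) for x, y >= 0,
# whose x-marginal is exp(-x).
def f(x, y):
    return math.exp(-x - y) if x >= 0 and y >= 0 else 0.0

def marginal_x(x, n=20_000, ymax=40.0):
    """f_X(x) = int f(x, y) dy, via a midpoint Riemann sum over [0, ymax]."""
    h = ymax / n
    return sum(f(x, (i + 0.5) * h) for i in range(n)) * h

print(marginal_x(1.0))  # should be close to exp(-1) ~ 0.3679
```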
CONDITIONAL DISTRIBUTION WITH CONTINUOUS VARIABLE
$f(y|x)=\frac{f(x,y)}{f_{X}(x)}$.
This represents the density of Y at $y$ given $X=x$. The single point $X=x$ has probability 0, so strictly one conditions on the event $X\in(x,x+\epsilon)$ and takes the limit $\epsilon\to 0^{+}$.
MOMENT
EXPECTATION
for discrete: $E[X]=\mu=\sum_{i=1}^{n}P(x_{i})\cdot x_{i}$
for continuous: $E[X]=\mu=\int_{-\infty}^{\infty}xf(x)dx$
this operator satisfies:
- Linearity: $E[kX]=kE[X]$; $E[X+Y]=E[X]+E[Y]$
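A Monte Carlo sketch of linearity, with assumed example distributions (X uniform on (0,1), Y exponential with rate 2); note the identities hold exactly for sample means, not just in the limit:

```python
import random

random.seed(0)  # reproducible sketch
# X ~ Uniform(0, 1) has E[X] = 0.5; Y ~ Exponential(rate 2) has E[Y] = 0.5.
N = 200_000
xs = [random.uniform(0, 1) for _ in range(N)]
ys = [random.expovariate(2.0) for _ in range(N)]

def mean(v):
    return sum(v) / len(v)

# E[kX] = kE[X] and E[X + Y] = E[X] + E[Y], checked on sample means.
print(mean([3 * x for x in xs]), 3 * mean(xs))
print(mean([x + y for x, y in zip(xs, ys)]), mean(xs) + mean(ys))
```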
VARIANCE
$V[X]=\sigma^{2}=E[(X-E[X])^{2}]=E[X^{2}]-E[X]^{2}$.
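The two forms of the variance can be checked against each other on a small discrete distribution (the values and probabilities below are an assumed example):

```python
# Assumed example distribution: values xs with probabilities ps.
xs = [1, 2, 3]
ps = [0.2, 0.5, 0.3]

EX  = sum(p * x for p, x in zip(ps, xs))      # E[X]   = 2.1
EX2 = sum(p * x * x for p, x in zip(ps, xs))  # E[X^2] = 4.9
var_definition = sum(p * (x - EX) ** 2 for p, x in zip(ps, xs))
var_shortcut   = EX2 - EX ** 2
print(var_definition, var_shortcut)  # both ~ 0.49
```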
COVARIANCE
$Cov[X,Y]=\sigma_{XY}\equiv E[(X-E[X])(Y-E[Y])]=E[XY]-E[X]E[Y]$.
If the covariance of X and Y is 0, then they are uncorrelated (no linear relationship), but this does not imply that they are independent.
It satisfies:
- Bilinearity: $Cov(X,Y+Z)=Cov(X,Y)+Cov(X,Z)$.
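The classic counterexample for "uncorrelated but not independent" can be sketched by simulation: with X symmetric about 0 and $Y=X^{2}$ (an assumed example), $Cov(X,Y)=E[X^{3}]-E[X]E[X^{2}]=0$ even though Y is a deterministic function of X:

```python
import random

random.seed(1)  # reproducible sketch
# X ~ N(0, 1) is symmetric about 0; Y = X^2 is fully determined by X.
N = 400_000
xs = [random.gauss(0, 1) for _ in range(N)]
ys = [x * x for x in xs]

def mean(v):
    return sum(v) / len(v)

cov = mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)
print(cov)  # close to 0, despite the exact functional dependence
```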
CORRELATION
$\begin{aligned}\rho\equiv Corr(X,Y)&=\frac{Cov(X,Y)}{\sqrt{V[X]V[Y]}}\\&=\frac{\sigma_{XY}}{\sigma_{X}\sigma_{Y}}\end{aligned}$
SKEWNESS
$S[X]=E\left[\left(\frac{X-\mu}{\sigma}\right)^{3}\right]$
It measures the asymmetry of a distribution. If the distribution is fully symmetric, then $S[X]=0$.
KURTOSIS
$K[X]=E\left[\left(\frac{X-\mu}{\sigma}\right)^{4}\right]$
It measures the sharpness of a distribution. For a standard normal distribution, the kurtosis is 3. Any distribution with kurtosis larger than 3 has a sharper peak and heavier tails than the normal distribution.
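Both standardized moments can be estimated from a sample; below is a sketch using an assumed standard-normal sample, where skewness should come out near 0 and kurtosis near 3:

```python
import random, math

random.seed(2)  # reproducible sketch
# Sample skewness and kurtosis as standardized 3rd and 4th moments.
N = 300_000
sample = [random.gauss(0, 1) for _ in range(N)]

mu = sum(sample) / N
sigma = math.sqrt(sum((x - mu) ** 2 for x in sample) / N)
z = [(x - mu) / sigma for x in sample]

skew = sum(t ** 3 for t in z) / N
kurt = sum(t ** 4 for t in z) / N
print(skew, kurt)  # close to 0 and 3 respectively
```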
CONDITIONAL MOMENT
$E[Y|X=x]=E[Y|x]=\int_{-\infty}^{\infty}yf(y|x)dy$
$V[Y|X=x]=V[Y|x]=\int_{-\infty}^{\infty}[y-E(Y|x)]^{2}f(y|x)dy$
INDEPENDENCE
- The strongest form of independence is full statistical independence: the value of X carries no information about the distribution of Y.
- A weaker form is mean-independence: suppose $E(Y|x)$ exists; if it does not depend on x (i.e. $E(Y|x)=E(Y)$), then Y is mean-independent of X. This does not imply that X is mean-independent of Y.
- The weakest form: uncorrelatedness, $Cov(X,Y)=0$.
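The asymmetry of mean-independence can be seen on a toy joint distribution (assumed for illustration): X uniform on $\{-1,0,1\}$ and $Y=1$ iff $X=0$. Then X is mean-independent of Y, but Y is not mean-independent of X:

```python
# Assumed toy joint distribution: each (x, y) pair below has probability 1/3.
outcomes = [(-1, 0), (0, 1), (1, 0)]  # X uniform on {-1,0,1}; Y = 1 iff X == 0

E_X = sum(x for x, _ in outcomes) / 3  # = 0

def E_X_given_Y(y):
    """E(X | Y = y): average x over the outcomes with that y."""
    xs = [x for x, yy in outcomes if yy == y]
    return sum(xs) / len(xs)

def E_Y_given_X(x):
    """E(Y | X = x): average y over the outcomes with that x."""
    ys = [yy for xx, yy in outcomes if xx == x]
    return sum(ys) / len(ys)

# X is mean-independent of Y: E(X|Y=y) = E(X) = 0 for both values of y...
print(E_X, E_X_given_Y(0), E_X_given_Y(1))  # 0.0 0.0 0.0
# ...but Y is NOT mean-independent of X: E(Y|X=x) varies with x.
print(E_Y_given_X(-1), E_Y_given_X(0))      # 0.0 1.0
```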