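Preliminary note: "estimator" is the standard statistical term (估计器 in Chinese).
Purpose: this is for the curious. Most people will never need these proofs, so if you really can't follow them, that's fine. Just remember the theorems and the properties they give you.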
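1.1 Proving that an estimator is unbiased
Example: which of these estimators is unbiased for the true variance:
(1) \( \frac{1}{n-1} \sum_{i=1}^{n}\left(X_{i}-\bar{X}\right)^{2}\) or (2) \(\frac{1}{n} \sum_{i=1}^{n}\left(X_{i}-\bar{X}\right)^{2}\)
Here we go with the proof, little turkeys.
Step 1. Derive an identity.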
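\( \begin{aligned}
E\left[\sum_{i}\left(X_{i}-\bar{X}\right)^{2}\right] &=E\left[\sum_{i}\left(X_{i}-\mu\right)^{2}\right]-E\left[n(\bar{X}-\mu)^{2}\right] \\
&=\sum_{i} E\left[\left(X_{i}-\mu\right)^{2}\right]-n E\left[(\bar{X}-\mu)^{2}\right] \\
&=\sum_{i} \operatorname{var}\left[X_{i}\right]-n \operatorname{var}[\bar{X}] \\
&=\sum_{i} \sigma^{2}-n \cdot \frac{\sigma^{2}}{n} \\
&=n \sigma^{2}-\sigma^{2} \\
&=(n-1) \sigma^{2}
\end{aligned} \) (call this (A))
The first equality comes from writing \( X_{i}-\bar{X}=\left(X_{i}-\mu\right)-(\bar{X}-\mu) \) and expanding the square, which gives \( \sum_{i}\left(X_{i}-\bar{X}\right)^{2}=\sum_{i}\left(X_{i}-\mu\right)^{2}-n(\bar{X}-\mu)^{2} \). The \( X_{i} \) are i.i.d. with mean \( \mu \) and variance \( \sigma^{2} \), so \( \operatorname{var}[\bar{X}]=\sigma^{2}/n \).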
Step 2:
- (2) Divide both sides of (A) by \(n\) \( \rightarrow \)
\( E\left[\frac{1}{n} \sum_{i}\left(X_{i}-\bar{X}\right)^{2}\right]=\frac{n-1}{n} \sigma^{2}\)
So \(\frac{1}{n} \sum_{i}\left(X_{i}-\bar{X}\right)^{2}\) is a biased estimator of \( \sigma^{2}\).
- (1) Divide both sides of (A) by \(n-1\) \( \rightarrow \)
\( E\left[\frac{1}{n-1} \sum_{i}\left(X_{i}-\bar{X}\right)^{2}\right]= \sigma^{2}\)
So \(\frac{1}{n-1} \sum_{i}\left(X_{i}-\bar{X}\right)^{2}\) is an unbiased estimator of \( \sigma^{2}\).
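If you would rather see the result than prove it, here is a rough simulation sketch. It averages both estimators over many small Gaussian samples; the sample size, `sigma`, the number of trials, the seed, and the function names are all illustrative assumptions of mine, not part of the proof.

```python
import random

def variance_estimates(sample):
    """Return the (divide-by-n, divide-by-(n-1)) variance estimates for one sample."""
    n = len(sample)
    mean = sum(sample) / n
    ss = sum((x - mean) ** 2 for x in sample)  # sum of squared deviations from the sample mean
    return ss / n, ss / (n - 1)

def average_estimates(n=5, sigma=2.0, trials=100_000, seed=0):
    """Average both estimators over many simulated samples of size n (assumed Gaussian)."""
    rng = random.Random(seed)
    total_biased = total_unbiased = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        biased, unbiased = variance_estimates(sample)
        total_biased += biased
        total_unbiased += unbiased
    return total_biased / trials, total_unbiased / trials

mean_biased, mean_unbiased = average_estimates()
# Theory: E[biased] = (n-1)/n * sigma^2 = 3.2 and E[unbiased] = sigma^2 = 4.0
print(mean_biased, mean_unbiased)
```

With a small `n` the gap between the two averages is easy to see; as `n` grows the factor \( \frac{n-1}{n} \) approaches 1 and the two estimators become hard to tell apart.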
1.2 Proving that an estimator is consistent for a parameter
Example: the sample variance \(S^2\) is a consistent estimator of \( \sigma^2\).
\( \begin{aligned}
S^{2}=\frac{1}{n-1} \sum_{i}\left(X_{i}-\bar{X}\right)^{2} &=\left(\frac{n}{n-1}\right)\left(\frac{1}{n} \sum_{i}\left(X_{i}-\bar{X}\right)^{2}\right) \\
&=\left(\frac{n}{n-1}\right)\left(\frac{1}{n}\left[\sum_{i} X_{i}^{2}-n \bar{X}^{2}\right]\right) \\
&=\left(\frac{n}{n-1}\right)\left(\frac{1}{n} \sum_{i} X_{i}^{2}-(\bar{X})^{2}\right)
\end{aligned} \)
By the law of large numbers, \( \frac{1}{n} \sum_{i} X_{i}^{2} \stackrel{p}{\rightarrow} E\left[X^{2}\right] \) and \( \bar{X} \stackrel{p}{\rightarrow} E[X] \), while \( \frac{n}{n-1} \rightarrow 1 \), so
\( \begin{aligned}
\Longrightarrow S^{2} & \stackrel{p}{\rightarrow}(1)\left(E\left[X^{2}\right]-(E[X])^{2}\right)=\sigma^{2}
\end{aligned} \)
End of proof.
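A quick numerical sanity check of the decomposition used above, as a sketch only. It computes \(S^2\) through the "mean of squares minus square of mean" form and looks at how far it lands from \( \sigma^2 \) for a few sample sizes. The Gaussian data, the values of `mu` and `sigma`, the seed, and the function name are my own assumptions for illustration.

```python
import random

def sample_variance_via_moments(sample):
    """S^2 = n/(n-1) * (mean of squares - square of mean), as in the derivation."""
    n = len(sample)
    mean_of_squares = sum(x * x for x in sample) / n
    mean = sum(sample) / n
    return n / (n - 1) * (mean_of_squares - mean ** 2)

rng = random.Random(1)
mu, sigma = 1.0, 2.0  # illustrative values; the true variance is sigma**2 = 4.0
errors = {}
for n in (10, 1_000, 100_000):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    errors[n] = abs(sample_variance_via_moments(sample) - sigma ** 2)
    print(n, errors[n])  # absolute error of S^2 at this sample size
```

A single run is random, so the error is not guaranteed to shrink at every step, but for large `n` it should be small, which is what convergence in probability promises.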
1.3 Proving that an estimator is MSE consistent for a parameter
Example:
For a sample from \( N\left(\mu, \sigma^{2}\right)\), \( \bar{X}\) is unbiased, so its MSE equals its variance:
\( \operatorname{MSE}(\bar{X})=\sigma^{2} / n \rightarrow 0\) as \(n \rightarrow \infty\)
Therefore \( \bar{X}\) is an MSE consistent estimator of \( \mu\).
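The \( \sigma^{2}/n \) rate can also be checked by brute force. The sketch below estimates \( \operatorname{MSE}(\bar{X}) \) by Monte Carlo for a few sample sizes and prints it next to \( \sigma^{2}/n \). The parameter values, trial count, seed, and function name are assumptions chosen only to keep the run short.

```python
import random

def mse_of_sample_mean(n, mu=1.0, sigma=2.0, trials=5_000, seed=2):
    """Monte Carlo estimate of E[(Xbar - mu)^2] for Gaussian samples of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        total += (xbar - mu) ** 2  # squared error of this sample mean
    return total / trials

results = {n: mse_of_sample_mean(n) for n in (4, 16, 64)}
for n, mse in results.items():
    print(n, mse, 2.0 ** 2 / n)  # simulated MSE next to the theoretical sigma^2 / n
```

Each time `n` is multiplied by 4, the simulated MSE should drop by roughly a factor of 4, up to Monte Carlo noise.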
1.4 Proving the Cramér-Rao lower bound
Don't read on unless you are really, really bored.
Goal of the proof:
\( \begin{aligned} D(\hat{\theta}) \geq \frac{1}{-\mathbb{E}\left[\frac{\partial^{2} \ln p(\boldsymbol{x} ; \theta)}{\partial \theta^{2}}\right]} \end{aligned} \)
where \( \begin{aligned} \mathbf{I}(\theta)=-\mathbb{E}\left[\frac{\partial^{2} \ln p(\boldsymbol{x} ; \theta)}{\partial \theta^{2}}\right] \end{aligned} \) is the Fisher information.
Since \( \hat{\theta}\) is unbiased, \( E[\hat{\theta}] = \theta \), so \( \rightarrow \) (by the definition of expectation)
\( \begin{aligned} \int(\hat{\theta}-\theta) p(\boldsymbol{x} ; \theta) \mathrm{d} \boldsymbol{x}=0 \quad \Rightarrow \quad \int \hat{\theta} p(\boldsymbol{x} ; \theta) \mathrm{d} \boldsymbol{x}=\theta \end{aligned}\)
Note that the estimator \( \hat{\theta}\) is a function of the observations \( \boldsymbol{x} \) only, not of \( \theta \). Differentiating both sides of the last identity with respect to \( \theta \) gives
\( \int \hat{\theta} \frac{\partial p(\boldsymbol{x} ; \theta)}{\partial \theta} \mathrm{d} \boldsymbol{x}=1 \)
\(\Rightarrow \int \hat{\theta} \frac{\partial \ln p(\boldsymbol{x} ; \theta)}{\partial \theta} p(\boldsymbol{x} ; \theta) \mathrm{d} \boldsymbol{x}=1 \quad (* 1) \)
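Note: \( \frac{\partial \ln p(\boldsymbol{x};\theta)}{\partial \theta} \) is the score, i.e. the derivative of the log-likelihood that you set to zero when looking for the MLE, so don't panic. The second derivative that shows up below is just the derivative of the score with respect to \( \theta\).
By the regularity condition \( \mathbb{E}\left[\frac{\partial \ln p(\boldsymbol{x};\theta)}{\partial \theta}\right] = 0 \), that is,
\( \int \frac{\partial \ln p(\boldsymbol{x} ; \theta)}{\partial \theta} p(\boldsymbol{x} ; \theta) \mathrm{d} \boldsymbol{x}=0 \), we get: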
\( \begin{aligned}
& \theta \int \frac{\partial \ln p(\boldsymbol{x} ; \theta)}{\partial \theta} p(\boldsymbol{x} ; \theta) \mathrm{d} \boldsymbol{x}=0 \\
\Rightarrow & \int \theta \frac{\partial \ln p(\boldsymbol{x} ; \theta)}{\partial \theta} p(\boldsymbol{x} ; \theta) \mathrm{d} \boldsymbol{x}=0 \quad (* 2)
\end{aligned} \)
Subtracting (*2) from (*1):
\( \begin{array}{c}
\int(\hat{\theta}-\theta) \frac{\partial \ln p(\boldsymbol{x} ; \theta)}{\partial \theta} p(\boldsymbol{x} ; \theta) \mathrm{d} \boldsymbol{x}=1 \\
\Rightarrow \int(\hat{\theta}-\theta) \sqrt{p(\boldsymbol{x} ; \theta)} \frac{\partial \ln p(\boldsymbol{x} ; \theta)}{\partial \theta} \sqrt{p(\boldsymbol{x} ; \theta)} \mathrm{d} \boldsymbol{x}=1
\end{array} \)
By the Cauchy-Schwarz inequality:
\( \int f^{2}(x) \mathrm{d} x \int g^{2}(x) \mathrm{d} x \geq\left(\int f(x) g(x) \mathrm{d} x\right)^{2}\)
Taking \( f=(\hat{\theta}-\theta) \sqrt{p(\boldsymbol{x} ; \theta)} \) and \( g=\frac{\partial \ln p(\boldsymbol{x} ; \theta)}{\partial \theta} \sqrt{p(\boldsymbol{x} ; \theta)} \), we then have:
\( \begin{array}{l}
\left(\int(\hat{\theta}-\theta)^{2} p(\boldsymbol{x} ; \theta) \mathrm{d} \boldsymbol{x}\right)\left(\int\left(\frac{\partial \ln p(\boldsymbol{x} ; \theta)}{\partial \theta}\right)^{2} p(\boldsymbol{x} ; \theta) \mathrm{d} \boldsymbol{x}\right) \geq 1 \\
\quad \Rightarrow \int(\hat{\theta}-\theta)^{2} p(\boldsymbol{x} ; \theta) \mathrm{d} \boldsymbol{x} \geq \frac{1}{\left(\int\left(\frac{\partial \ln p(\boldsymbol{x} ; \theta)}{\partial \theta}\right)^{2} p(\boldsymbol{x} ; \theta) \mathrm{d} \boldsymbol{x}\right)}
\end{array}\) that is:
\(D(\hat{\theta}) \geq \frac{1}{\mathbb{E}\left[\left(\frac{\partial \ln p(\boldsymbol{x} ; \theta)}{\partial \theta}\right)^{2}\right]}\)
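Note: D(X) is the variance of X. Because \( \hat{\theta} \) is unbiased, \( \int(\hat{\theta}-\theta)^{2} p(\boldsymbol{x} ; \theta) \mathrm{d} \boldsymbol{x} \) is exactly \( D(\hat{\theta}) \).
Now it only remains to prove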
\( \begin{aligned} \mathbb{E}\left[\left(\frac{\partial \ln p(\boldsymbol{x} ; \theta)}{\partial \theta}\right)^{2}\right]=-\mathbb{E}\left[\frac{\partial^{2} \ln p(\boldsymbol{x} ; \theta)}{\partial \theta^{2}}\right] \end{aligned}\)
- As noted above, \( \mathbb{E}\left[\frac{\partial \ln p(\boldsymbol{x};\theta)}{\partial \theta}\right] = 0 \). Differentiating both sides with respect to \(\theta\):
\( \begin{aligned}
& \frac{\partial}{\partial \theta} \int \frac{\partial \ln p(\boldsymbol{x} ; \theta)}{\partial \theta} p(\boldsymbol{x} ; \theta) \mathrm{d} \boldsymbol{x}=0 \\
\Rightarrow & \int\left[\frac{\partial^{2} \ln p(\boldsymbol{x} ; \theta)}{\partial \theta^{2}} p(\boldsymbol{x} ; \theta)+\frac{\partial \ln p(\boldsymbol{x} ; \theta)}{\partial \theta} \frac{\partial p(\boldsymbol{x} ; \theta)}{\partial \theta}\right] \mathrm{d} \boldsymbol{x}=0 \\
\Rightarrow & \int \frac{\partial^{2} \ln p(\boldsymbol{x} ; \theta)}{\partial \theta^{2}} p(\boldsymbol{x} ; \theta) \mathrm{d} \boldsymbol{x}=-\int\left(\frac{\partial \ln p(\boldsymbol{x} ; \theta)}{\partial \theta}\right)^{2} p(\boldsymbol{x} ; \theta) \mathrm{d} \boldsymbol{x}
\end{aligned}\)
Note: the last step uses the chain rule, \( \frac{\partial p(\boldsymbol{x} ; \theta)}{\partial \theta}=\frac{\partial \ln p(\boldsymbol{x} ; \theta)}{\partial \theta} p(\boldsymbol{x} ; \theta) \).
Writing \( \mathbf{I}(\theta)=-\mathbb{E}\left[\frac{\partial^{2} \ln p(\boldsymbol{x} ; \theta)}{\partial \theta^{2}}\right] \) (the Fisher information), the bound becomes \( D(\hat{\theta}) \geq \frac{1}{\mathbf{I}(\theta)} \).
Done.
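For a concrete feel, here is a small sketch for one specific model that I am assuming for illustration: a single observation from \( N(\mu, \sigma^{2}) \) with \( \sigma \) known and \( \theta=\mu \). In that model the score is \( (x-\mu)/\sigma^{2} \) and the second derivative of \( \ln p \) is the constant \( -1/\sigma^{2} \). The code checks by simulation that the score averages to about 0 and that the mean squared score matches the Fisher information \( 1/\sigma^{2} \). The function names, parameter values, and seed are hypothetical.

```python
import random

def score(x, mu, sigma):
    """d/d(mu) of ln p(x; mu) for N(mu, sigma^2) with sigma known."""
    return (x - mu) / sigma ** 2

def second_derivative(x, mu, sigma):
    """d^2/d(mu)^2 of ln p(x; mu); it does not depend on x in this model."""
    return -1.0 / sigma ** 2

rng = random.Random(3)
mu, sigma, draws = 1.0, 2.0, 200_000
xs = [rng.gauss(mu, sigma) for _ in range(draws)]

mean_score = sum(score(x, mu, sigma) for x in xs) / draws          # regularity condition: about 0
mean_score_sq = sum(score(x, mu, sigma) ** 2 for x in xs) / draws  # E[score^2], about 1/sigma^2
fisher_info = -sum(second_derivative(x, mu, sigma) for x in xs) / draws  # -E[second derivative]

n = 25  # hypothetical sample size; information adds up over i.i.d. observations
crlb = 1.0 / (n * fisher_info)  # lower bound on the variance of any unbiased estimator of mu
print(mean_score, mean_score_sq, fisher_info, crlb)
```

In this model the bound works out to \( \sigma^{2}/n \), which is exactly the variance of \( \bar{X} \) from section 1.3, so the sample mean attains the Cramér-Rao lower bound here.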