跳转至

回归分析

起源:”衰退“(regression)现象

一元线性模型

模型:\(y=ax+b\)

本质:寻找直线,使得尽可能靠近数据点,以最小误差进行拟合

损失:残差平方和\(\dfrac{1}{N}\sum(y-\hat y)^2\)最小

参数求解:最小二乘法

优化目标:

\[ \min_{a,b}L(a,b)=\sum_{i=1}^n(y_i-ax_i-b)^2 \]

求导:

\[ \dfrac{\partial L(a,b)}{\partial b}=0\Rightarrow\sum_{i=1}^n(y_i-ax_i-b)=0 \]
\[ \therefore b=\bar y-a\bar x \]
\[ \dfrac{\partial L(a,b)}{\partial a}=0\Rightarrow a=\dfrac{\sum\limits_{i=1}^nx_iy_i-n\bar x\bar y}{\sum\limits_{i=1}^nx_i^2-n\bar x^2}\]

推广:多维

\[ \begin{array}{l} L(\mathbf X,\mathbf y,\mathbf a)=\dfrac{1}{2n}\left\Vert\mathbf y-\mathbf {Xa}\right\Vert^2\\ \dfrac{\partial}{\partial\mathbf a}L(\mathbf X,\mathbf y,\mathbf a)=-\dfrac{1}{n}(\mathbf{y-Xa})^{\mathsf T}\mathbf X\\ \dfrac{\partial}{\partial\mathbf a}L(\mathbf X,\mathbf y,\mathbf a)=0\Rightarrow \mathbf a^*=(\mathbf{X^{\mathsf T}X})^{-1}\mathbf{Xy} \end{array} \]