# Linear Regression

## A Model

A model is an estimator of the data that maps the inputs $\mathbf X$ to the predicted outputs $\hat{\mathbf Y}$, i.e., $\hat{\mathbf Y} = F( \mathbf X )$. The map $F$ may require some parameters, $\{\boldsymbol\alpha, \boldsymbol\beta, \cdots \}$.

We also need an estimator that tells us how good the model is for a given set of parameters. For example, we could define a loss function $L(\mathbf Y,\hat{\mathbf Y} )$ that measures the discrepancy between the actual data and the predicted results. We then choose the parameters that minimize this discrepancy.

So a model usually has

- a map,
- an estimator.
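These two ingredients can be sketched in code. The linear map and squared-error loss below are illustrative choices (they anticipate the linear model of the next section); the names `F` and `L` follow the notation above.

```python
import numpy as np

def F(X, beta):
    """The map: inputs X -> predictions Y_hat."""
    return X @ beta

def L(Y, Y_hat):
    """The estimator: residual sum of squares between data and predictions."""
    return np.sum((Y - Y_hat) ** 2)

X = np.array([[1.0], [2.0], [3.0]])
Y = np.array([2.0, 4.0, 6.0])
beta = np.array([2.0])
print(L(Y, F(X, beta)))  # a perfect fit gives loss 0.0
```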

## Linear Model

The model is simple:

$$ \hat Y_i = X_{ij}\beta_ j + \beta_0 $$

One might want to create an augmented dataset by including a 0th column $X_{i0} = 1$, so that the intercept is absorbed into the coefficients and we have $$ \hat Y_i = X_{ij}\beta_ j $$ with $j=0,1,2,\ldots$.
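The augmentation is a single column stack. A minimal sketch (the data values here are made up for illustration):

```python
import numpy as np

# Prepend a 0th column of ones so the intercept beta_0
# is absorbed into the coefficient vector.
X = np.array([[2.0], [3.0], [5.0]])              # original features
X_aug = np.hstack([np.ones((X.shape[0], 1)), X])  # column X_{i0} = 1
print(X_aug)
# [[1. 2.]
#  [1. 3.]
#  [1. 5.]]
```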

Using least squares as our estimator, we minimize the RSS loss function $L$ by choosing the suitable parameters $\beta_j$. The loss function is

$$ L = ( Y_i - X_{ij}\beta_j )( Y_i - X_{ik}\beta_k ). $$
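The repeated indices above are summed over, following the Einstein convention. The same contraction can be written explicitly with `np.einsum` (the data values below are illustrative, chosen so that $Y = X\beta$ exactly):

```python
import numpy as np

# RSS with the summation convention: L = r_i r_i, where r_i = Y_i - X_ij beta_j
X = np.array([[1.0, 2.0], [1.0, 3.0], [1.0, 5.0]])
Y = np.array([5.0, 7.0, 11.0])
beta = np.array([1.0, 2.0])

r = Y - np.einsum("ij,j->i", X, beta)  # residual vector r_i
L = np.einsum("i,i->", r, r)           # contract the repeated index i
print(L)  # 0.0 here, since Y = X @ beta exactly
```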

Minimizing it requires $\partial_{\beta_m} L = 0$, together with the second-order condition that the Hessian $\partial_{\beta_m}\partial_{\beta_n} L$ is positive definite.

We have

$$ \begin{align} \partial_{\beta_m} L =& (\partial_{\beta_m} ( Y_i - X_{ij}\beta_j ) ) ( Y_i - X_{ik}\beta_k ) + ( Y_i - X_{ij}\beta_j ) \partial_{\beta_m} ( Y_i - X_{ik}\beta_k ) \\ =& - X_{ij} \delta_{jm}( Y_i - X_{ik}\beta_k ) + ( Y_i - X_{ij}\beta_j ) ( - X_{ik}\delta_{km} ) \\ =& - 2 X_{im} ( Y_i - X_{ij}\beta_j ) \end{align} $$

Solving $- 2 X_{im} ( Y_i - X_{ij}\beta_j ) = 0$, we have

$$ \begin{align} & 0 = X_{im} ( Y_i - X_{ij}\beta_j ) \\ & X_{im} X_{ij}\beta_j = X_{im} Y_i \end{align} $$

Note that the left-hand side involves both multiplication and summation, since we are using the Einstein summation convention. This is better understood in the abstract matrix notation,

$$ \boldsymbol \beta = ( \mathbf X^T \mathbf X )^{-1} \mathbf X^T \mathbf Y, $$

assuming $\mathbf X^T \mathbf X$ is invertible.
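In practice one solves the normal equations $\mathbf X^T \mathbf X \boldsymbol\beta = \mathbf X^T \mathbf Y$ directly rather than forming the inverse. A sketch with made-up data:

```python
import numpy as np

# Solve the normal equations X^T X beta = X^T Y.
# np.linalg.solve is numerically preferable to computing the inverse.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])  # augmented with a ones column
Y = np.array([2.0, 3.0, 5.0])

beta = np.linalg.solve(X.T @ X, X.T @ Y)
print(beta)  # [intercept, slope]
```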

## Scalar $\mathbf X_i$

For scalar $\mathbf X_i = x_i$, we can replace $\mathbf X$ with a vector $\vec x$ over all data points. For a fit through the origin (no intercept term), we have

$$ \beta = \frac{x_j y_j}{x_i x_i}. $$
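This ratio of dot products is one line of numpy (the data values are illustrative):

```python
import numpy as np

# Through-the-origin fit: beta = (x_j y_j) / (x_i x_i)
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])

beta = np.dot(x, y) / np.dot(x, x)
print(beta)  # 2.0, since y = 2 x exactly
```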

More generally, with an intercept included, least squares leads to the following results^{1}

$$ \begin{align} k &= \frac{\sum_j (x_j - \bar x) y_j}{\sum_i (x_i - \bar x)(x_i - \bar x)} \\ b &= \bar y - k \bar x, \end{align} $$

where $\bar x$ and $\bar y$ are the mean values. Then the model

$$ y = k x + b $$

can be obtained.
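The mean-centred formulas for $k$ and $b$ translate directly (illustrative data, chosen to match the normal-equation example above):

```python
import numpy as np

# Slope and intercept from the mean-centred least-squares formulas.
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 3.0, 5.0])

x_bar, y_bar = x.mean(), y.mean()
k = np.sum((x - x_bar) * y) / np.sum((x - x_bar) ** 2)
b = y_bar - k * x_bar
print(k, b)  # slope 1.5, intercept 1/3
```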

L Ma (2019). 'Linear Regression', Datumorphism, 01 April. Available at: https://datumorphism.leima.is/wiki/statistics/linear-regression/.