Normalization Methods for Numeric Data
Normalization of data is critical for statistical analysis and feature engineering.
Min-max Normalization
This method is a straightforward linear rescaling of the data onto a chosen range.
Suppose we are analyzing series A, with elements $a_i$. We already know the min and max of the series, $a_{min}$ and $a_{max}$.
Now we would like to normalize the series to lie within the range $[a'_{min}, a'_{max}]$. We simply solve for $a'_i$ in $$ \frac{a'_i - a'_{min}}{a'_{max} - a'_{min}} = \frac{a_i - a_{min}}{a_{max} - a_{min}}, $$ where everything on the right-hand side is known and $a'_{min}$ and $a'_{max}$ are the new min and max that we choose to scale to. Solving for $a'_i$ gives $$ a'_i = a'_{min} + \frac{(a_i - a_{min})(a'_{max} - a'_{min})}{a_{max} - a_{min}}. $$
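As a minimal sketch of the min-max formula, the following Python function solves for the rescaled value of each element. The function name `min_max_normalize` and the parameters `new_min` and `new_max` are hypothetical names chosen for illustration. Raising an error for a constant series is also an assumption, since the formula is undefined when $a_{max} = a_{min}$.

```python
def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values so that their min and max map to new_min and new_max."""
    old_min, old_max = min(values), max(values)
    if old_max == old_min:
        # Assumption: the formula divides by (old_max - old_min), so a
        # constant series has no defined result; we raise instead of guessing.
        raise ValueError("cannot min-max normalize a constant series")
    scale = (new_max - new_min) / (old_max - old_min)
    return [new_min + (a - old_min) * scale for a in values]


series = [2.0, 4.0, 6.0, 10.0]
print(min_max_normalize(series))              # rescaled to the range [0, 1]
print(min_max_normalize(series, -1.0, 1.0))   # rescaled to the range [-1, 1]
```

Because the map is linear, the ordering and the relative spacing of the data points are preserved; only the offset and the overall scale change.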
Z-score Normalization
Z-score normalization rescales the data using the standard deviation, since the standard deviation measures how far the data points deviate from the mean.
$$ a'_i = \frac{ a_i - \bar A }{ \sigma_A }, $$
where $\bar A$ is the mean of the series and $\sigma_A$ is its standard deviation.
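A minimal sketch of the z-score formula using only the Python standard library is given here. The function name `z_score_normalize` is hypothetical, and the use of the population standard deviation (`statistics.pstdev`) for $\sigma_A$ is an assumption; `statistics.stdev` could be swapped in if the sample estimate is intended.

```python
import statistics


def z_score_normalize(values):
    """Shift values by their mean and divide by their standard deviation."""
    mean = statistics.mean(values)
    # Assumption: sigma_A is the population standard deviation.
    std = statistics.pstdev(values)
    if std == 0:
        # A constant series has zero spread, so the z-score is undefined.
        raise ValueError("cannot z-score normalize a constant series")
    return [(a - mean) / std for a in values]


series = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
normalized = z_score_normalize(series)
print(normalized)
print(statistics.mean(normalized), statistics.pstdev(normalized))
```

After normalization the series has mean 0 and standard deviation 1 (up to floating-point rounding), so each value can be read as the number of standard deviations away from the mean.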
Decimal Scaling
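Decimal scaling shifts the decimal point of the data by dividing every value by a power of 10.
$$ a'_i = a_i / 10^j $$
Choose $j$, typically the smallest suitable integer, so that the absolute values of the new values are not larger than 1.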
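The choice of $j$ can be sketched as a small search in Python. The function name `decimal_scale` is hypothetical, and two choices here are assumptions rather than part of the definition: the code picks the smallest non-negative integer $j$, and it compares absolute values so that negative data are handled as well.

```python
def decimal_scale(values):
    """Divide values by the smallest power of 10 that makes every |value| <= 1."""
    max_abs = max(abs(a) for a in values)
    j = 0
    # Assumption: only scale down (j >= 0); increase j until the largest
    # absolute value is no larger than 1.
    while max_abs / 10 ** j > 1:
        j += 1
    return [a / 10 ** j for a in values], j


scaled, j = decimal_scale([-120, 45, 987])
print(j)       # exponent of the power of 10 that was used
print(scaled)  # all values now lie within [-1, 1]
```

Unlike min-max and z-score normalization, this method never shifts the data: zero stays at zero and the signs are unchanged.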
L Ma (2018). 'Normalization Methods for Numeric Data', Datumorphism, 11 April. Available at: https://datumorphism.leima.is/wiki/statistics/normalization-methods/.