# Normalization Methods for Numeric Data

Normalization of data is critical for statistical analysis and feature engineering.

## Min-max Normalization

Min-max normalization is a straightforward linear rescaling of the data onto a chosen range.

Suppose we are analyzing a series $A$ with elements $a_i$, and we already know the minimum and maximum of the series, $a_{min}$ and $a_{max}$.

Now we would like to normalize the series to be within the range $[a'_{min}, a'_{max}]$. We simply solve for $a'_i$ in
$$
\frac{a'_i - a'_{min}}{a'_{max} - a'_{min}} = \frac{a_i - a_{min}}{a_{max} - a_{min}},
$$
where everything on the right-hand side is known, and $a'_{min}$ and $a'_{max}$ are the new minimum and maximum chosen as the target range.
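
Solving the relation above for the new value gives $a'_i = a'_{min} + (a_i - a_{min}) \, (a'_{max} - a'_{min}) / (a_{max} - a_{min})$. A minimal Python sketch of this follows. The function name `min_max_normalize`, its parameters, and the sample series are illustrative assumptions, and the error raised for a constant series (where $a_{max} = a_{min}$) is one possible design choice rather than part of the method.

```python
def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values from [min(values), max(values)] to [new_min, new_max]."""
    old_min, old_max = min(values), max(values)
    if old_max == old_min:
        # Assumption: treat a constant series as an error, since the
        # denominator a_max - a_min would be zero.
        raise ValueError("min-max normalization is undefined for a constant series")
    scale = (new_max - new_min) / (old_max - old_min)
    return [new_min + (v - old_min) * scale for v in values]


series = [2.0, 4.0, 6.0, 10.0]  # hypothetical sample data
print(min_max_normalize(series))             # [0.0, 0.25, 0.5, 1.0]
print(min_max_normalize(series, -1.0, 1.0))  # [-1.0, -0.5, 0.0, 1.0]
```

The smallest element always maps to the new minimum and the largest to the new maximum, so a single outlier can compress all other values into a narrow part of the target range.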

## Z-score Normalization

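Z-score normalization rescales the data using the standard deviation, since the standard deviation measures how far the data points deviate from the mean. Each value is expressed as its distance from the mean $\bar A$ in units of the standard deviation $\sigma_A$: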

$$ a'_i = \frac{a_i - \bar A}{\sigma_A} $$
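
The formula can be sketched in Python with the standard library's `statistics` module. The function name `z_score_normalize` and the sample series are illustrative assumptions. The sketch also assumes $\sigma_A$ is the population standard deviation (`pstdev`); if the sample standard deviation is intended, `statistics.stdev` would be used instead.

```python
from statistics import mean, pstdev


def z_score_normalize(values):
    """Return (a_i - mean) / std for each element of values."""
    mu = mean(values)
    # Assumption: sigma_A is the population standard deviation.
    sigma = pstdev(values, mu)
    if sigma == 0:
        raise ValueError("z-score normalization is undefined for a constant series")
    return [(v - mu) / sigma for v in values]


series = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical data: mean 5, std 2
print(z_score_normalize(series))
# [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

After the transformation the series has mean 0 and standard deviation 1, but unlike min-max normalization the values are not confined to a fixed range.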

## Decimal Scaling

Decimal scaling shifts the decimal point of the data by dividing every value by a power of 10.

$$ a'_i = a_i / 10^j $$

Choose $j$, usually the smallest such integer, so that the new values are not larger than 1 in absolute value, i.e. $\max_i |a'_i| \le 1$.
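
A minimal Python sketch of decimal scaling follows. The function name `decimal_scale` and the sample series are illustrative assumptions. The sketch further assumes that $j$ is the smallest non-negative integer for which no scaled value exceeds 1 in absolute value; some textbooks require the scaled values to be strictly smaller than 1, which would change `>` to `>=` in the loop condition.

```python
def decimal_scale(values):
    """Divide values by 10**j, with j the smallest non-negative integer
    such that max(|v|) / 10**j <= 1. Returns (scaled_values, j)."""
    max_abs = max(abs(v) for v in values)
    j = 0
    # Increase j until the largest magnitude is no longer above 1.
    while max_abs / 10 ** j > 1:
        j += 1
    return [v / 10 ** j for v in values], j


series = [-345, 120, 987]  # hypothetical sample data
scaled, j = decimal_scale(series)
print(j)       # 3
print(scaled)  # [-0.345, 0.12, 0.987]
```

Because only the largest magnitude determines $j$, the relative spacing of the values is preserved exactly; the data are only moved by whole decimal places.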

L Ma (2018). 'Normalization Methods for Numeric Data', Datumorphism, 11 April. Available at: https://datumorphism.leima.is/wiki/statistics/normalization-methods/.