Statistics

An overview of statistics

Jargon in statistics: accuracy, precision, population, sample, etc.

Detecting correlations using Pearson's chi-square test of independence

Detecting correlations in numeric data using correlation coefficients

Detecting correlations in numerical data using correlation coefficients

Linear regression of multidimensional data

Given two uniform marginals, we can apply the inverse CDF of a continuous distribution to form a new …
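The inverse-CDF step can be sketched as follows. This is a minimal illustration, not the page's own code: it assumes the two uniform marginals are coupled through a Gaussian copula with a hypothetical correlation `rho`, and it picks an exponential and a standard normal as the target marginals. The helper names are invented for this example.

```python
import math
import random
from statistics import NormalDist


def gaussian_copula_uniforms(rho, n, seed=0):
    """Draw n pairs (u, v) with uniform marginals, coupled by a Gaussian copula."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # z2 is standard normal and has correlation rho with z1
        z2 = rho * z1 + math.sqrt(1.0 - rho**2) * rng.gauss(0.0, 1.0)
        # the normal CDF maps each normal variable to a uniform on (0, 1)
        pairs.append((std_normal.cdf(z1), std_normal.cdf(z2)))
    return pairs


def exponential_inv_cdf(u, rate=1.0):
    """Inverse CDF of the exponential distribution (assumed target marginal)."""
    return -math.log(1.0 - u) / rate


uniform_pairs = gaussian_copula_uniforms(rho=0.8, n=5000)
std_normal = NormalDist()
# Apply an inverse CDF to each uniform marginal: exponential for x, normal for y.
samples = [(exponential_inv_cdf(u), std_normal.inv_cdf(v)) for u, v in uniform_pairs]
```

The dependence between the two coordinates comes from the coupling of the uniforms, while each marginal is set independently by the inverse CDF applied to it.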

The Box-Cox transformation makes data more nearly Gaussian, which is especially useful in feature engineering, e.g., stabilizing the variance of a time series.
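For a fixed parameter $\lambda$, the one-parameter Box-Cox transform is $(x^\lambda - 1)/\lambda$ for $\lambda \neq 0$ and $\ln x$ for $\lambda = 0$, defined for strictly positive $x$. A minimal sketch, assuming $\lambda$ is supplied by the caller rather than estimated from the data (the function name and sample series are made up for illustration):

```python
import math


def box_cox(values, lam):
    """Box-Cox transform of strictly positive values for a fixed lambda."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        # the lambda -> 0 limit of (x**lam - 1) / lam is the natural log
        return [math.log(v) for v in values]
    return [(v**lam - 1.0) / lam for v in values]


series = [1.0, 2.0, 4.0, 8.0, 16.0]   # illustrative data with growing spread
log_like = box_cox(series, 0)         # lambda = 0: log transform
sqrt_like = box_cox(series, 0.5)      # lambda = 0.5: shifted, scaled square root
```

In practice $\lambda$ is usually chosen by maximizing a likelihood over the data; that step is left out here.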

A likelihood is not necessarily a PDF
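A small numerical check can make this concrete. The sketch below assumes a binomial model with hypothetical values `n = 10` and `k_observed = 3`: summed over the data with the parameter fixed, the probabilities add up to 1, but integrated over the parameter with the data fixed, the likelihood has area $1/(n+1)$, not 1.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of k successes in n trials with success probability p."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


n, k_observed = 10, 3

# As a function of the data k (p fixed), the values sum to 1.
total_over_data = sum(binomial_pmf(k, n, 0.3) for k in range(n + 1))

# As a function of the parameter p (data fixed), integrate over [0, 1]
# with the midpoint rule; the area is 1 / (n + 1), so it is not a density in p.
steps = 10_000
area_over_p = sum(
    binomial_pmf(k_observed, n, (i + 0.5) / steps) for i in range(steps)
) / steps
```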

Fraser Information: The Fraser information is $$ I_F(\theta) = \int g(X) …

The Kullback–Leibler divergence measures how one probability distribution differs from another
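For discrete distributions on a common support, $D_{\mathrm{KL}}(p \,\|\, q) = \sum_i p_i \ln(p_i / q_i)$. A minimal sketch with two made-up distributions, showing that the divergence is zero for identical inputs and is not symmetric:

```python
import math


def kl_divergence(p, q):
    """D_KL(p || q) in nats for two discrete distributions on the same support."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # 0 * log(0 / q) is taken to be 0
        if q_i == 0.0:
            return math.inf  # p puts mass where q has none
        total += p_i * math.log(p_i / q_i)
    return total


p = [0.5, 0.5]   # illustrative distributions, not from the source
q = [0.9, 0.1]
forward = kl_divergence(p, q)   # D_KL(p || q)
reverse = kl_divergence(q, p)   # D_KL(q || p), generally a different value
```

Because `forward` and `reverse` differ, the KL divergence is not a distance metric in the usual sense.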

Reparametrize the sampling distribution to simplify sampling
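One common instance of this idea, assumed here for illustration, is the location-scale reparametrization of a Gaussian: instead of sampling $z \sim \mathcal{N}(\mu, \sigma^2)$ directly, draw parameter-free noise $\epsilon \sim \mathcal{N}(0, 1)$ and set $z = \mu + \sigma\epsilon$. The function name and parameter values below are hypothetical.

```python
import random
from statistics import fmean, pstdev


def sample_reparametrized(mu, sigma, n, seed=0):
    """Sample N(mu, sigma^2) as a deterministic transform of standard normal noise."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(n)]  # does not depend on mu or sigma
    return [mu + sigma * eps for eps in noise]


draws = sample_reparametrized(mu=2.0, sigma=0.5, n=20_000)
sample_mean = fmean(draws)    # should be close to mu
sample_std = pstdev(draws)    # should be close to sigma
```

All randomness sits in `noise`, so the sample is a deterministic, differentiable function of `mu` and `sigma`.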

MDL (minimum description length) is a measure of how well a model compresses data by minimizing the combined cost of the description of the model and the misfit.

The Bonferroni correction controls the family-wise error rate in a multiple comparisons problem
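In its simplest form, the correction tests each of $m$ hypotheses at level $\alpha / m$ instead of $\alpha$. A minimal sketch with made-up p-values (the function name is hypothetical):

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Return one flag per test: True where the hypothesis is rejected."""
    m = len(p_values)
    if m == 0:
        return []
    threshold = alpha / m  # per-test significance level
    return [p <= threshold for p in p_values]


p_values = [0.001, 0.012, 0.03, 0.2]     # illustrative p-values from 4 tests
rejected = bonferroni_reject(p_values)   # per-test threshold is 0.05 / 4 = 0.0125
```

Here the third test would pass an uncorrected 0.05 threshold but is not rejected after the correction.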

In a multiple comparisons problem, we deal with multiple statistical tests simultaneously. Examples …

Arcsine Distribution: The PDF is $$ \frac{1}{\pi\sqrt{x(1-x)}} $$ for $x\in (0,1)$. It can also be …
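Integrating this PDF gives the CDF $F(x) = \frac{2}{\pi}\arcsin\sqrt{x}$, which inverts to $x = \sin^2(\pi u / 2)$. A minimal sketch of inverse-CDF sampling built on that formula (function names are chosen for this example):

```python
import math
import random


def arcsine_pdf(x):
    """PDF of the standard arcsine distribution on (0, 1)."""
    return 1.0 / (math.pi * math.sqrt(x * (1.0 - x)))


def arcsine_cdf(x):
    """CDF obtained by integrating the PDF: (2 / pi) * asin(sqrt(x))."""
    return 2.0 / math.pi * math.asin(math.sqrt(x))


def sample_arcsine(n, seed=0):
    """Inverse-CDF sampling: x = sin(pi * u / 2) ** 2 with u uniform on [0, 1)."""
    rng = random.Random(seed)
    return [math.sin(math.pi * rng.random() / 2.0) ** 2 for _ in range(n)]


draws = sample_arcsine(20_000)
sample_mean = sum(draws) / len(draws)   # the distribution is symmetric about 0.5
```

The density is U-shaped, so draws cluster near 0 and 1 rather than around the mean.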

Two categories with probabilities $p$ and $1-p$ respectively. For each experiment, the sample space is …

Beta Distribution: mode, median, and mean as functions of the parameters alpha and beta …