# Poisson Process

Published: 2019-06-18

Category: { Statistics }

Tags:

References:
- axelpale/poisson-process

Summary:

Pages: 24

# Bayes' Theorem

Published: 2019-06-18

Category: { Math }

Tags:

References:
- Tree Diagram of Bayes Theorem @ Wikipedia

Summary: Bayes’ Theorem is stated as
$$ P(A\mid B) = \frac{P(B \mid A) P(A)}{P(B)} $$
where $P(A\mid B)$ is the likelihood of $A$ given $B$ and $P(A)$ is the marginal probability of $A$. There is a nice tree diagram for Bayes' theorem on Wikipedia.
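As a quick numeric sanity check, the theorem can be applied with $P(B)$ expanded by the law of total probability; the diagnostic-test numbers below (1% prevalence, 99% sensitivity, 5% false-positive rate) are hypothetical:

```python
def posterior(p_b_given_a, p_a, p_b_given_not_a):
    """Bayes' theorem: P(A|B) = P(B|A) P(A) / P(B),
    with P(B) expanded via total probability."""
    p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)
    return p_b_given_a * p_a / p_b

# Hypothetical test: rare condition, accurate test, yet most positives are false.
print(round(posterior(0.99, 0.01, 0.05), 4))  # 0.1667
```

Even with a 99%-sensitive test, the posterior probability is only about 1/6 because the condition is rare.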


# Kendall Tau Correlation

Published: 2019-07-20

Category: { Statistics }

Tags:

References:
- Kendall, M. G. (1938). A new measure of rank correlation. Biometrika, 30(1–2), 81–93.
- Kendall rank correlation coefficient
- Rank correlation

Summary: Definition. Given two series of data $X$ and $Y$, consider pairs of co-occurring observations $(x_i, y_i)$ and $(x_j, y_j)$ with $i < j$:

- concordant: $x_i < x_j$ and $y_i < y_j$, or $x_i > x_j$ and $y_i > y_j$; the count is denoted $C$;
- discordant: $x_i < x_j$ and $y_i > y_j$, or $x_i > x_j$ and $y_i < y_j$; the count is denoted $D$;
- neither concordant nor discordant: whenever a tie occurs.

Kendall's tau is defined as
$$ \begin{equation} \tau = \frac{C - D}{\text{all possible pairs of comparison}} = \frac{C - D}{n(n-1)/2} \end{equation} $$
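The definition translates directly into code by counting concordant and discordant pairs; a brute-force sketch (ties count as neither, matching the definition above):

```python
from itertools import combinations

def kendall_tau(x, y):
    """Brute-force Kendall's tau over all n(n-1)/2 pairs."""
    n = len(x)
    c = d = 0
    for i, j in combinations(range(n), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        if prod > 0:
            c += 1   # concordant pair
        elif prod < 0:
            d += 1   # discordant pair
        # prod == 0 means a tie: neither concordant nor discordant
    return (c - d) / (n * (n - 1) / 2)

print(kendall_tau([1, 2, 3, 4], [1, 3, 2, 4]))  # 0.666...
```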


# Jackknife Resampling

Published: 2020-01-26

Category: { Statistics }

Tags:

References:
- Jackknife Resampling

Summary: The jackknife estimates the bias and variance of a statistic by recomputing it on leave-one-out subsamples.
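A minimal sketch of the method (the sample data are made up); for the sample mean, the jackknife standard error reduces to the familiar $s/\sqrt{n}$:

```python
import math

def jackknife_se(data, stat):
    """Jackknife standard error: recompute `stat` on each leave-one-out sample."""
    n = len(data)
    loo = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    mean_loo = sum(loo) / n
    return math.sqrt((n - 1) / n * sum((t - mean_loo) ** 2 for t in loo))

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean = lambda xs: sum(xs) / len(xs)
print(jackknife_se(data, mean))  # matches s / sqrt(n) for the mean
```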


# Covariance Matrix

Published: 2020-03-10

Category: { Math }

Tags:

References:
- Covariance Matrix @ Wikipedia

Summary: Also known as the second central moment, the covariance matrix is a measure of the spread of a distribution.
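A quick illustration with NumPy (the simulated data are arbitrary): the covariance matrix carries the variances on its diagonal and the covariances off it, and it is symmetric:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = 2 * x + rng.normal(scale=0.5, size=1000)  # y correlates with x

cov = np.cov(np.stack([x, y]))  # 2x2 matrix; rows of the input are variables
print(cov)
# Diagonal entries are variances; off-diagonal entries are covariances.
```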


# Gamma Distribution

Published: 2020-03-14

Category: { Statistics }

Tags:

Summary: Gamma Distribution PDF:
$$ \frac{\beta^\alpha x^{\alpha-1} e^{-\beta x}}{\Gamma(\alpha)} $$
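The PDF can be evaluated directly from the formula above; a standard-library sketch with shape $\alpha$ and rate $\beta$ (for $\alpha = 1$ it reduces to the exponential distribution):

```python
import math

def gamma_pdf(x, alpha, beta):
    """Gamma PDF with shape alpha and rate beta, for x > 0."""
    return beta ** alpha * x ** (alpha - 1) * math.exp(-beta * x) / math.gamma(alpha)

print(gamma_pdf(1.0, 1.0, 1.0))  # e^{-1}, the Exponential(1) density at x = 1
```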


# Cauchy-Lorentz Distribution

Published: 2020-03-14

Category: { Statistics }

Tags:

Summary: The Cauchy-Lorentz distribution arises as the distribution of the ratio of two independent normally distributed random variables with mean zero.
Source: https://en.wikipedia.org/wiki/Cauchy_distribution
The Lorentz distribution is frequently used in physics.
PDF:
$$ \frac{1}{\pi\gamma} \left( \frac{\gamma^2}{ (x-x_0)^2 + \gamma^2} \right) $$
The median and mode of the Cauchy-Lorentz distribution are both $x_0$. $\gamma$ is the half width at half maximum, so the FWHM is $2\gamma$.
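A small numeric check of the PDF above: the peak is $1/(\pi\gamma)$, and at $x_0 \pm \gamma$ the density drops to half the peak, which is why $\gamma$ is a half width:

```python
import math

def cauchy_pdf(x, x0=0.0, gamma=1.0):
    """Cauchy-Lorentz PDF with location x0 and scale gamma (HWHM)."""
    return gamma / (math.pi * ((x - x0) ** 2 + gamma ** 2))

peak = cauchy_pdf(0.0)
print(peak, cauchy_pdf(1.0) / peak)  # 1/pi, then exactly 0.5 at x0 + gamma
```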


# Categorical Distribution

Published: 2020-03-14

Category: { Statistics }

Tags:

References:
- Categorical Distribution @ Wikipedia

Summary: By generalizing the Bernoulli distribution to $k$ states, we get a categorical distribution. The sample space is $\{s_1, s_2, \cdots, s_k\}$. The corresponding probabilities for each state are $\{p_1, p_2, \cdots, p_k\}$ with the constraint $\sum_{i=1}^k p_i = 1$.
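Sampling from a categorical distribution is one line with NumPy; the state names and probabilities below are arbitrary:

```python
import numpy as np

states = ["s1", "s2", "s3"]
probs = [0.2, 0.5, 0.3]  # must sum to 1

rng = np.random.default_rng(42)
draws = rng.choice(states, size=100_000, p=probs)

# Empirical frequencies should be close to the specified probabilities.
freq = {s: np.mean(draws == s) for s in states}
print(freq)
```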


# Binomial Distribution

Published: 2020-03-14

Category: { Statistics }

Tags:

References:
- Binomial Distribution @ Wikipedia

Summary: The number of successes in $n$ independent events where each trial has a success rate of $p$.
PMF:
$$ C_n^k p^k (1-p)^{n-k} $$
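The PMF can be evaluated with `math.comb`, where $C_n^k$ is the binomial coefficient; the probabilities over $k = 0, \dots, n$ sum to one:

```python
from math import comb

def binom_pmf(k, n, p):
    """P(k successes in n independent trials with success rate p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

print(binom_pmf(2, 4, 0.5))                         # 0.375
print(sum(binom_pmf(k, 4, 0.5) for k in range(5)))  # 1.0
```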


# Beta Distribution

Published: 2020-03-14

Category: { Statistics }

Tags:

Summary: Beta distribution.


# Bernoulli Distribution

Published: 2020-03-14

Category: { Statistics }

Tags:

References:
- Bernoulli Distribution @ Wikipedia

Summary: Two categories with probabilities $p$ and $1-p$ respectively.
For each experiment, the sample space is $\{A, B\}$. The probability of state $A$ is $p$ and the probability of state $B$ is $1-p$. After $N$ such experiments, the probability of $K$ results with state $s$ being $s=A$ and $N-K$ results with state $s$ being $B$ is
$$ P\left(\sum_i^N s_i = K \right) = C_N^K p^K (1 - p)^{N-K}. $$


# Arcsine Distribution

Published: 2020-03-14

Category: { Statistics }

Tags:

Summary: Arcsine Distribution The PDF is
$$ \frac{1}{\pi\sqrt{x(1-x)}} $$
for $x\in [0,1]$.
It can also be generalized to
$$ \frac{1}{\pi\sqrt{(x-a)(b-x)}} $$
for $x\in [a,b]$.
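A quick check of the standard PDF on $[0, 1]$: it is symmetric about $1/2$, where the density attains its minimum $2/\pi$, and it blows up at the endpoints:

```python
import math

def arcsine_pdf(x):
    """Standard arcsine PDF on (0, 1)."""
    return 1.0 / (math.pi * math.sqrt(x * (1 - x)))

print(arcsine_pdf(0.5))  # 2/pi, the minimum of the density
```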


# Conditional Probability Table

Published: 2020-10-27

Category: { Math }

Tags:

References:
- Tree Diagram of Bayes' Theorem @ Wikipedia

Summary: The conditional probability table is commonly abbreviated as CPT.


# Normalized Maximum Likelihood

Published: 2020-11-08

Tags:

Summary: $$ \mathrm{NML} = \frac{ p(y| \hat \theta(y)) }{ \int_X p( x| \hat \theta (x) ) dx } $$
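For discrete data the normalizing integral becomes a sum over all possible datasets. A sketch for the Bernoulli model over binary sequences of length $n$, where the MLE for a sequence with $k$ ones is $\hat\theta = k/n$ (the helper names are mine):

```python
from math import comb

def nml_complexity(n):
    """Normalization (denominator) of NML for the Bernoulli model,
    summed over all binary sequences of length n, grouped by the count k."""
    total = 0.0
    for k in range(n + 1):
        t = k / n
        # Python evaluates 0.0 ** 0 as 1.0, matching the k = 0 and k = n cases.
        total += comb(n, k) * t ** k * (1 - t) ** (n - k)
    return total

def nml(k, n):
    """NML score of one observed sequence with k ones out of n."""
    t = k / n
    return t ** k * (1 - t) ** (n - k) / nml_complexity(n)

print(nml(7, 10))
```

Summing `nml` over all sequences recovers 1, confirming it is a proper distribution over datasets.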


# Minimum Description Length

Published: 2020-11-08

Tags:

References:
- Vandekerckhove, J., & Matzke, D. (2015). Model comparison and the principle of parsimony. Oxford Library of Psychology.
- Grünwald, P. D. (2007). The Minimum Description Length Principle. MIT Press.

Summary: MDL is a measure of how well a model compresses data by minimizing the combined cost of the description of the model and the misfit.


# Kolmogorov Complexity

Published: 2020-11-08

Tags:

References:
- Fortnow, L. (2000). Kolmogorov complexity. (January), 1–14.
- Grünwald, P. D. (2007). The Minimum Description Length Principle. MIT Press.

Summary: Description of Data
The measurement of complexity is based on the observation that the compressibility of data doesn’t depend on the “language” used to describe the compression process that much. This makes it possible for us to find a universal language, such as a universal computer language, to quantify the compressibility of the data.
One intuitive idea is to use a programming language to describe the data. If we have a sequence of data,
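The compressibility intuition can be illustrated with an off-the-shelf compressor; `zlib` stands in for the "universal language" here, giving a crude upper bound on description length:

```python
import random
import zlib

# A regular string admits a much shorter description than an irregular one
# of the same length, whatever compressor is used.
random.seed(0)
regular = b"ab" * 500
irregular = bytes(random.getrandbits(8) for _ in range(1000))

print(len(zlib.compress(regular)), len(zlib.compress(irregular)))
```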


# Fisher Information Approximation

Published: 2020-11-08

Tags:

Summary: FIA is a method to approximate the minimum description length (MDL) of models,
$$ \mathrm{FIA} = -\ln p(y | \hat\theta) + \frac{k}{2} \ln \frac{n}{2\pi} + \ln \int_\Theta \sqrt{ \det I(\theta) }\, d\theta $$
$I(\theta)$: Fisher information matrix of sample size 1.
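A sketch for the one-parameter Bernoulli model (my choice of example), where $I(\theta) = 1/(\theta(1-\theta))$ and $\int_0^1 \sqrt{I(\theta)}\, d\theta = \pi$ in closed form:

```python
import math

def fia_bernoulli(k, n):
    """FIA for the Bernoulli model (model dimension is 1), for 0 < k < n.
    The geometric-complexity integral of sqrt(det I) over (0, 1) equals pi."""
    theta = k / n  # MLE
    neg_log_lik = -(k * math.log(theta) + (n - k) * math.log(1 - theta))
    return neg_log_lik + 0.5 * math.log(n / (2 * math.pi)) + math.log(math.pi)

print(fia_bernoulli(7, 10))
```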


# Bayesian Information Criterion

Published: 2020-11-08

Tags:

References:
- Vandekerckhove, J., & Matzke, D. (2015). Model comparison and the principle of parsimony. Oxford Library of Psychology.

Summary: BIC considers the number of parameters and the total number of data records.
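Concretely, $\mathrm{BIC} = k \ln n - 2 \ln \hat L$; the model fits in the sketch below are hypothetical numbers chosen for illustration:

```python
import math

def bic(log_likelihood, k, n):
    """BIC = k ln n - 2 ln L: the parameter count k is penalized more
    heavily as the number of data records n grows."""
    return k * math.log(n) - 2 * log_likelihood

# A 2-parameter and a 5-parameter model with nearly identical fits:
# the smaller model wins (lower BIC is better).
print(bic(-120.0, 2, 100), bic(-119.0, 5, 100))
```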


# Bayes Factors

Published: 2020-11-08

Tags:

Summary: $$ \frac{p(\mathscr M_1|y)}{ p(\mathscr M_2|y) } = \frac{p(\mathscr M_1)}{ p(\mathscr M_2) }\frac{p(y|\mathscr M_1)}{ p(y|\mathscr M_2) } $$
Bayes factor
$$ \mathrm{BF_{12}} = \frac{m(y|\mathscr M_1)}{m(y|\mathscr M_2)} $$
$\mathrm{BF_{12}}$: how many times more likely model $\mathscr M_1$ is than $\mathscr M_2$.
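A worked example with my own choice of models, where both marginal likelihoods have closed forms: $\mathscr M_1$ fixes $\theta = 0.5$, while $\mathscr M_2$ puts a uniform prior on $\theta$, whose marginal likelihood integrates to $1/(n+1)$:

```python
from math import comb

# Data: k = 7 heads in n = 10 tosses.
n, k = 10, 7
m1 = comb(n, k) * 0.5 ** n  # marginal likelihood under theta = 0.5
m2 = 1 / (n + 1)            # integral of C(n,k) theta^k (1-theta)^(n-k) dtheta

bf12 = m1 / m2
print(bf12)  # > 1 means the data favor M1
```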


# Akaike Information Criterion

Published: 2020-11-08

Tags:

References:
- Akaike Information Criterion @ Wikipedia
- Vandekerckhove, J., & Matzke, D. (2015). Model comparison and the principle of parsimony. Oxford Library of Psychology.

Summary: Suppose we have a model that describes the data generation process behind a dataset. The distribution by the model is denoted as $\hat f$. The actual data generation process is described by a distribution $f$.
We ask the question:
How good is the approximation using $\hat f$?
To be more precise, how much information is lost if we use our model distribution $\hat f$ to substitute for the actual data generation distribution $f$?
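AIC answers this with $\mathrm{AIC} = 2k - 2\ln \hat L$, an estimate of relative information loss. A sketch for a Gaussian maximum-likelihood fit with $k = 2$ parameters (the data are made up):

```python
import math

def aic(log_likelihood, k):
    """AIC = 2k - 2 ln L; lower values indicate less estimated information loss."""
    return 2 * k - 2 * log_likelihood

data = [2.1, 2.9, 3.2, 3.8, 4.4]
n = len(data)
mu = sum(data) / n
var = sum((x - mu) ** 2 for x in data) / n  # MLE variance (divide by n)
log_lik = -0.5 * n * (math.log(2 * math.pi * var) + 1)  # Gaussian MLE log-likelihood
print(aic(log_lik, 2))
```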
