Bayesian Information Criterion
Published:
Tags:
References:
- Vandekerckhove, J., & Matzke, D. (2015). Model comparison and the principle of parsimony. Oxford Library of Psychology.
Summary: BIC penalizes model complexity using both the number of free parameters and the total number of observations.
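For reference, the usual definition (in the notation of the AIC note below, with $n$ the number of observations):
$$ \mathrm{BIC} = -2 \ln p(y|\hat\theta) + k \ln n $$
Unlike AIC's constant $2k$ penalty, the $k \ln n$ penalty grows with the sample size.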
Bayes Factors
Published:
Tags:
Summary: $$ \frac{p(\mathscr M_1|y)}{ p(\mathscr M_2|y) } = \frac{p(\mathscr M_1)}{ p(\mathscr M_2) }\frac{p(y|\mathscr M_1)}{ p(y|\mathscr M_2) } $$
Bayes factor
$$ \mathrm{BF_{12}} = \frac{m(y|\mathscr M_1)}{m(y|\mathscr M_2)} $$
$\mathrm{BF_{12}}$: how many times more likely the data are under model $\mathscr M_1$ than under $\mathscr M_2$.
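A minimal Python sketch, assuming an illustrative pair of models for $n$ coin flips (not from the note): $\mathscr M_1$ places a uniform $\mathrm{Beta}(1,1)$ prior on the heads probability, $\mathscr M_2$ fixes it at $1/2$; both marginal likelihoods are available in closed form.

```python
import numpy as np
from scipy.special import betaln, gammaln

# Hypothetical data: n coin flips with k heads.
n, k = 100, 60

# log binomial coefficient log C(n, k)
log_binom = gammaln(n + 1) - gammaln(k + 1) - gammaln(n - k + 1)

# m(y|M1) = C(n,k) * B(k+1, n-k+1)  (Beta(1,1) prior integrated out)
log_m1 = log_binom + betaln(k + 1, n - k + 1)

# m(y|M2) = C(n,k) * 0.5^n  (point null theta = 1/2)
log_m2 = log_binom + n * np.log(0.5)

bf_12 = np.exp(log_m1 - log_m2)
print(f"BF_12 = {bf_12:.3f}")  # how many times more likely the data are under M1
```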
Akaike Information Criterion
Published:
Tags:
References:
- Akaike Information Criterion @ Wikipedia
- Vandekerckhove, J., & Matzke, D. (2015). Model comparison and the principle of parsimony. Oxford Library of Psychology.
Summary: Suppose we have a model that describes the data-generation process behind a dataset. The distribution given by the model is denoted $\hat f$; the actual data-generation process is described by a distribution $f$.
We ask the question:
How good is the approximation using $\hat f$?
More precisely, how much information is lost if we substitute the actual data-generation distribution $f$ with our model distribution $\hat f$?
AIC defines this information loss as
$$ \mathrm{AIC} = - 2 \ln p(y|\hat\theta) + 2k $$
- $y$: data set
- $\hat\theta$: model parameters estimated by maximum likelihood
- $\ln p(y|\hat\theta)$: maximized log-likelihood (the goodness of fit)
- $k$: number of adjustable model parameters; $+2k$ is the complexity penalty
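A minimal Python sketch, assuming a Gaussian model fitted by maximum likelihood to simulated data (the data and model are illustrative, not from the note):

```python
import numpy as np
from scipy import stats

# Hypothetical data; fit a normal model by maximum likelihood.
rng = np.random.default_rng(0)
y = rng.normal(loc=2.0, scale=1.5, size=200)

mu_hat, sigma_hat = stats.norm.fit(y)  # MLEs of the k = 2 parameters
log_lik = stats.norm.logpdf(y, mu_hat, sigma_hat).sum()

k = 2
aic = -2 * log_lik + 2 * k
print(f"AIC = {aic:.2f}")
```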
Reparametrization in Expectation Sampling
Published:
Category: { statistics }
Tags:
Summary: Reparametrize the sampling distribution to simplify sampling when estimating an expectation.
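A minimal numpy sketch, assuming the note refers to the standard location-scale reparametrization $x = \mu + \sigma \epsilon$ with $\epsilon \sim \mathcal N(0, 1)$, which lets us estimate $\mathbb E_{x \sim \mathcal N(\mu, \sigma)}[f(x)]$ by sampling only from a fixed standard normal:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 1.0, 2.0

def f(x):
    return x ** 2  # E[f(x)] = mu**2 + sigma**2 = 5 for x ~ N(mu, sigma)

# Direct Monte Carlo: sample x from N(mu, sigma).
direct = f(rng.normal(mu, sigma, size=100_000)).mean()

# Reparametrized: sample eps from N(0, 1) and set x = mu + sigma * eps.
eps = rng.standard_normal(100_000)
reparam = f(mu + sigma * eps).mean()

print(direct, reparam)  # both estimate 5
```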
Explained Variation
Published:
Category: { statistics }
Tags:
References:
- Explained variation
Summary: Using the [[Fraser information]],
$$ I_F(\theta) = \int g(X) \ln f(X;\theta) \, \mathrm dX, $$
the information gain when comparing two models, $\theta_0$ and $\theta_1$, is proportional to $I_F(\theta_1) - I_F(\theta_0)$. The Fraser information is closely related to [[Fisher information]], Shannon information, and [[Kullback information]]. With it, we can define the relative information gain of a model,
$$ \rho_C^2 = 1 - \frac{ \exp( - 2 I_F(\theta_1) ) }{ \exp( - 2 I_F(\theta_0) ) }. $$
Likelihood
Published:
Category: { statistics }
Tags:
References:
- Jaynes ET. Probability Theory: The Logic of Science. Cambridge University Press; 2003. doi:10.1017/CBO9780511790423
- Parameter Estimation | CS109: Probability for Computer Scientists
Summary: Likelihood is not necessarily a pdf: viewed as a function of the parameter with the data fixed, it need not integrate to one.
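As a quick illustration (a standard example, not from the cited references): for a single Bernoulli observation $y = 1$, the likelihood is $L(\theta) = p(y = 1|\theta) = \theta$, and
$$ \int_0^1 L(\theta) \, \mathrm d\theta = \int_0^1 \theta \, \mathrm d\theta = \frac 1 2 \neq 1, $$
so the likelihood is not a probability density over $\theta$.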
Box-Cox Transformation
Published:
Category: { statistics }
Tags:
References:
- Box GEP, Cox DR. An analysis of transformations. J R Stat Soc. 1964;26: 211–243. doi:10.1111/j.2517-6161.1964.tb00553.x
- Vélez JI, Correa JC, Marmolejo-Ramos F. A new approach to the Box–Cox transformation. Frontiers in Applied Mathematics and Statistics. 2015;1: 12. doi:10.3389/fams.2015.00012
- How to Use Power Transforms for Time Series Forecast Data with Python
Summary: The Box-Cox transformation reshapes data toward a Gaussian distribution, which is especially useful in feature engineering, e.g., stabilizing the variance of a time series.
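The one-parameter transformation from the Box and Cox (1964) reference is
$$ y^{(\lambda)} = \begin{cases} \dfrac{y^\lambda - 1}{\lambda}, & \lambda \neq 0, \\ \ln y, & \lambda = 0, \end{cases} $$
defined for positive $y$. A minimal Python sketch using scipy, with simulated log-normal data as an assumed illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical right-skewed (log-normal) data; Box-Cox requires positive values.
x = rng.lognormal(mean=0.0, sigma=1.0, size=1000)

# With lmbda unspecified, scipy estimates lambda by maximum likelihood
# and returns the transformed data along with the estimate.
x_transformed, lambda_hat = stats.boxcox(x)
print(f"estimated lambda = {lambda_hat:.3f}")  # near 0 here, where Box-Cox ~ log
```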
Multivariate Normal Distribution
Published:
Category: { Distributions }
Tags:
Summary: Multivariate Gaussian distribution
Copula
Published:
Category: { statistics }
Tags:
References:
- Quant D. A Simple Introduction to Copulas. YouTube. 2021. Available: https://www.youtube.com/watch?v=WFEzkoK7tsE
- Wiecki T. An intuitive, visual guide to copulas — While My MCMC Gently Samples. In: While My MCMC Gently Samples [Internet]. 3 May 2018 [cited 6 Jan 2023]. Available: https://twiecki.io/blog/2018/05/03/copulas/
- SDV. Introduction to Copulas — Copulas 0.6.0 documentation. In: Copulas [Internet]. 2018 [cited 6 Jan 2023]. Available: https://sdv.dev/Copulas/tutorials/01_Introduction_to_Copulas.html
Summary: Given two uniform marginals with some dependence structure, we can apply the inverse CDF of any continuous distribution to each marginal to form a new joint distribution.
Some examples in this notebook.
Uniform marginals with a [[Gaussian]] copula: Normal, Normal.
Some other examples:
- [[Normal]] and [[Beta]]: Normal, Beta
- Gumbel and [[Beta]]: Gumbel, Beta
- [[t distribution]] copula: t, t
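A minimal Python sketch of a Gaussian copula, assuming scipy and illustrative Normal and Beta marginals (the correlation and marginal parameters are arbitrary choices):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Sample from a correlated bivariate Gaussian (the copula's dependence source).
cov = [[1.0, 0.8], [0.8, 1.0]]
z = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=5000)

# Push each margin through the normal CDF to get correlated uniform marginals.
u = stats.norm.cdf(z)

# Apply inverse CDFs to give the marginals any desired continuous distribution.
x1 = stats.norm.ppf(u[:, 0], loc=2.0, scale=0.5)  # Normal marginal
x2 = stats.beta.ppf(u[:, 1], a=2, b=5)            # Beta marginal

# The dependence structure from the Gaussian copula carries over.
print(np.corrcoef(x1, x2)[0, 1])
```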
t Distribution
Published:
Category: { Distributions }
Tags:
Summary: Student's t distribution
Normal Distribution
Published:
Category: { Distributions }
Tags:
Summary: Gaussian distribution