# Minimum Description Length

## #MDL #Model Selection

The minimum description length, aka, MDL, is based on the relations between regularity and data compression. (See Kolmogorov complexity Kolmogorov Complexity Description of Data The measurement of complexity is based on the observation that the compressibility of data doesn’t depend on the “language” used to describe the compression process that much. This makes it possible for us to find a universal language, such as a universal computer language, to quantify the compressibility of the data. One intuitive idea is to use a programming language to describe the data. If we have a sequence of data, for more about data descriptions.).

In statistical inferences, given a dataset $\mathcal D$, the compressions of the data is our model $\mathcal M$, or hypothesis $\mathcal H$ in the language of hypothesis testing.

MDL looks for the model that compresses the data well but with a reasonable cost of complexity. The complexity of the model is described by a length $L(\mathcal M)$, the goodness of the model is $G$. MDL looks into the balance of $L(\mathcal M)$ and $G$. For example, one could calculate the length of the model using the number of parameters $k$ in the model, and the goodness of the model using likelihood $p(\mathcal D \mid \mathcal M)$.

There are many versions of MDL: ^{1}

- crude two-part code,
- Fisher information approximation ( FIA Fisher Information Approximation FIA is a method to describe the minimum description length ( MDL Minimum Description Length MDL is a measure of how well a model compresses data by minimizing the combined cost of the description of the model and the misfit. ) of models, $$ \mathrm{FIA} = -\ln p(y | \hat\theta) + \frac{k}{2} \ln \frac{n}{2\pi} + \ln \int_\Theta \sqrt{ \operatorname{det}[I(\theta)] d\theta } $$ $I(\theta)$: Fisher information matrix of sample size 1. ),
- Normalized Maximum likelihood ( NML Normalized Maximum Likelihood $$ \mathrm{NML} = \frac{ p(y| \hat \theta(y)) }{ \int_X p( x| \hat \theta (x) ) dx } $$ ).

Lei Ma (2020). 'Minimum Description Length', Datumorphism, 11 April. Available at: https://datumorphism.leima.is/cards/statistics/mdl/.

**Current Ref:**

- cards/statistics/mdl.md

**Links to:**

###### Kolmogorov Complexity

Description of Data The measurement of complexity is based on the observation that the …

###### Fisher Information Approximation

FIA is a method to describe the minimum description length ( MDL Minimum Description Length MDL is …

###### Normalized Maximum Likelihood

$$ \mathrm{NML} = \frac{ p(y| \hat \theta(y)) }{ \int_X p( x| \hat \theta (x) ) dx } $$

###### Model Selection

Suppose we have a generating process that generates some numbers based on a distribution. Based on a …

###### Likelihood

Likelihood is not necessarily a pdf

**Links from:**

###### Model Selection

Suppose we have a generating process that generates some numbers based on a distribution. Based on a …

###### MDL and Neural Networks

Minimum Description Length ( MDL Minimum Description Length MDL is a measure of how well a model …

###### Fisher Information Approximation

FIA is a method to describe the minimum description length ( MDL Minimum Description Length MDL is …

###### Kolmogorov Complexity

Description of Data The measurement of complexity is based on the observation that the …