# Model Comparison

## #AIC #BIC

The principle of parsimony comes from the idea of Occam’s razor: among models with comparable explanatory power, we choose the simpler one.

The instance theory is a good model to explain the lexical decision task, but it is not the only one. Its simplicity, however, is what makes it popular.

## What is a Good Model?

A good model is presumably judged by the following criteria:

- plausibility
- balance of parsimony and goodness-of-fit
- coherence of the underlying assumptions
- ease of understanding when it breaks down
- consistency with known results
  - especially with the simple and basic phenomena
- ability to explain rather than describe data
- extent to which model predictions can be falsified through experiments

## How to choose a model?

It takes some thinking and calculations to choose a model.

- Model comparison (Logan does a great job in his paper about Instance Theory of Attention and Memory)
- Model selection
- Hypothesis testing

## Compare Models

Many methods deal with the balance between parsimony and goodness-of-fit:

- Information criteria: AIC and BIC
- Minimum description length
- Bayes factors

### Information Criteria: IC

We calculate the IC of every model at hand and define, for each model $i$, the difference from the smallest IC among them:

$$ \Delta_i = \mathrm{IC}_i - \min_m \mathrm{IC}_m $$

We then calculate the weight of each model, where $M$ is the number of models being compared:

$$ w_i = \frac{ \exp(-\Delta_i/2) }{ \sum_{m=1}^M \exp(-\Delta_m/2) } $$

We prefer the model with the largest weight $w_i$.

If we use AIC as the IC in the formula, the weight $w_i$ is called the Akaike weight; if we use BIC, it is called the Schwarz weight.
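
The weight calculation above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `ic_weights` and the three IC values are hypothetical, chosen only to show the arithmetic.

```python
import math


def ic_weights(ic_values):
    """Turn a list of IC values (AIC or BIC) into model weights.

    Implements Delta_i = IC_i - min IC and
    w_i = exp(-Delta_i / 2) / sum_m exp(-Delta_m / 2).
    """
    best = min(ic_values)
    deltas = [ic - best for ic in ic_values]
    relative = [math.exp(-d / 2) for d in deltas]
    total = sum(relative)
    return [r / total for r in relative]


# Hypothetical AIC values for three candidate models (illustration only).
aic_values = [100.0, 102.0, 110.0]
weights = ic_weights(aic_values)

for ic, w in zip(aic_values, weights):
    print(f"IC = {ic:6.1f}  weight = {w:.3f}")
```

The weights sum to one, and only differences between IC values matter: adding the same constant to every IC leaves the weights unchanged.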

### MDL

Fisher Information Approximation is one of the methods to determine the minimum description length.

### Bayes Factors

## Table of Contents

**Current Ref:**

- wiki/model-selection/model-selection.md

**Links to:**

###### Measures of Generalizability

###### Goodness-of-fit

Does the data agree with the model? Distance between data and model predictions; likelihood function: …

###### Bayes Factors

$$ \frac{p(\mathscr M_1|y)}{ p(\mathscr M_2|y) } = \frac{p(\mathscr M_1)}{ p(\mathscr M_2) …

###### Akaike Information Criterion

Suppose we have a model that describes the data generation process behind a dataset. The …

###### Bayesian Information Criterion

BIC is the Bayesian information criterion; it replaces the $+2k$ term in AIC with $k\ln n$ $$ …
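
As a rough sketch of how the two penalty terms differ, the snippet below assumes the standard definitions $\mathrm{AIC} = 2k - 2\ln \hat L$ and $\mathrm{BIC} = k\ln n - 2\ln \hat L$, with $k$ parameters, $n$ observations, and maximized likelihood $\hat L$. The excerpt above is truncated, so these definitions, the function names, and the numbers are assumptions for illustration rather than quotes from the linked page.

```python
import math


def aic(log_likelihood, k):
    """AIC = 2k - 2 ln L, with k free parameters (assumed standard form)."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood, k, n):
    """BIC = k ln n - 2 ln L, with n observations (assumed standard form)."""
    return k * math.log(n) - 2 * log_likelihood


# Hypothetical fit: maximized log-likelihood, parameter count, sample size.
log_lik, k, n = -120.0, 3, 50

aic_value = aic(log_lik, k)
bic_value = bic(log_lik, k, n)
print(f"AIC = {aic_value:.3f}, BIC = {bic_value:.3f}")
```

Under these definitions the BIC penalty exceeds the AIC penalty whenever $\ln n > 2$, that is, for $n \geq 8$, so BIC tends to favor smaller models as the sample grows.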

###### Fisher Information Approximation

#FIA is a method to describe the [[minimum-description-length|minimum description length ( #MDL )]] …

###### Minimum Description Length

The minimum description length ( #MDL ) is based on the idea of compression of the data. MDL looks …