Goodness-of-fit

#Model Selection

Does the data agree with the model?

Calculate the distance between the data and the model predictions.
Apply likelihood-based methods, such as those used in Bayesian inference: compute the likelihood of observing the data if we assume the model; fitting the model this way yields a set of fitted parameters.
…
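The first two measures above can be sketched in a few lines. This is a minimal illustration, not a definitive implementation: the toy data, the linear model `y = a * x + b`, the noise level `sigma`, and all function names are assumptions introduced here. It uses the sum of squared residuals as the distance and assumes independent Gaussian noise for the likelihood.

```python
import math

def sum_of_squared_residuals(y_obs, y_pred):
    """Distance between data and model predictions (smaller means a closer fit)."""
    return sum((y - p) ** 2 for y, p in zip(y_obs, y_pred))

def gaussian_log_likelihood(y_obs, y_pred, sigma):
    """Log-likelihood of the data, assuming independent Gaussian noise
    with standard deviation `sigma` around the model predictions."""
    n = len(y_obs)
    ssr = sum_of_squared_residuals(y_obs, y_pred)
    return -0.5 * n * math.log(2 * math.pi * sigma ** 2) - ssr / (2 * sigma ** 2)

# Hypothetical toy data and a hypothetical linear model y = a * x + b.
x = [0.0, 1.0, 2.0, 3.0]
y_obs = [0.1, 0.9, 2.1, 2.9]

def predict(a, b):
    return [a * xi + b for xi in x]

close_fit = predict(1.0, 0.0)  # parameters that follow the data closely
poor_fit = predict(0.0, 1.5)   # a flat line through the middle of the data

ssr_close = sum_of_squared_residuals(y_obs, close_fit)
ssr_poor = sum_of_squared_residuals(y_obs, poor_fit)
ll_close = gaussian_log_likelihood(y_obs, close_fit, sigma=0.5)
ll_poor = gaussian_log_likelihood(y_obs, poor_fit, sigma=0.5)

print(ssr_close < ssr_poor)  # the closer fit has the smaller distance
print(ll_close > ll_poor)    # and the larger likelihood
```

Under the Gaussian-noise assumption with fixed `sigma`, the two measures rank parameter choices identically, since the log-likelihood is a decreasing function of the sum of squared residuals.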

Why don’t we always use goodness-of-fit as the only measure of how good a model is?

We may experience overfitting.
The model may not be intuitive.

This is why we balance goodness-of-fit against parsimony, using measures of generalizability.

K-nearest neighbors and overfitting

The overfitting problem is easily demonstrated using the k-nearest neighbors (KNN) model.

Suppose we use $k=1$, i.e., each prediction is taken from the single nearest data point and no other neighbors. Evaluated on the data used to build it, every point is its own nearest neighbor, so the model agrees with the data 100%. If we require only goodness-of-fit, we may as well choose this $k=1$ model. However, such a model is useless: it is the dataset itself and offers no further insight.
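The effect can be checked on a toy example. The sketch below reads the $k=1$ model above as a nearest-neighbor classifier, which is an assumption; the one-dimensional data, the two deliberately noisy training labels, and the helper names `knn_predict` and `accuracy` are all hypothetical and exist only for illustration.

```python
from collections import Counter

def knn_predict(train, query, k):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    neighbors = sorted(train, key=lambda point: abs(point[0] - query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

def accuracy(train, data, k):
    """Fraction of points in `data` whose label is predicted correctly."""
    hits = sum(knn_predict(train, x, k) == label for x, label in data)
    return hits / len(data)

# Hypothetical data: the underlying rule is label = 1 if x > 5,
# but the training labels at x = 3 and x = 7 are noisy.
train = [(1, 0), (2, 0), (3, 1), (4, 0), (6, 1), (7, 0), (8, 1), (9, 1)]
held_out = [(1.5, 0), (2.9, 0), (3.2, 0), (6.8, 1), (7.3, 1), (8.6, 1)]

train_acc_k1 = accuracy(train, train, k=1)    # each point is its own nearest neighbor
held_acc_k1 = accuracy(train, held_out, k=1)  # the noisy labels are reproduced faithfully
train_acc_k3 = accuracy(train, train, k=3)
held_acc_k3 = accuracy(train, held_out, k=3)

print(train_acc_k1, held_acc_k1)
print(train_acc_k3, held_acc_k3)
```

On this toy data the $k=1$ model fits the training set perfectly yet does worse on the held-out points than $k=3$, whose training fit is imperfect. Goodness-of-fit alone would pick the wrong model.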

Planted: 2020-11-08 by L Ma

References:

Vandekerckhove, J., & Matzke, D. (2015). Model comparison and the principle of parsimony. Oxford Library of Psychology.

L Ma (2020). 'Goodness-of-fit', Datumorphism, 11 April. Available at: https://datumorphism.leima.is/wiki/model-selection/goodness-of-fit/.