Model Selection

Model selection is the process of choosing a good model from a set of candidates. But what makes a model good, and how do we quantify it?

5 MDL and Neural Networks

Category: Model Selection
Summary: Minimum Description Length (MDL) measures how well a model compresses data by minimizing the combined cost of describing the model and the misfit. MDL can be used to construct a concise network. A fully connected network has great expressive power, but it easily overfits. One strategy is to apply constraints to the network: limit the connections; share weights within subgroups of the network; constrain the weights using probability distributions.
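To make the MDL trade-off concrete, here is a minimal NumPy sketch assuming a toy linear model with Gaussian noise and a crude two-part code (a fixed description cost per nonzero weight). The data, cost terms, and function names are illustrative choices, not the post's own construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: only the first two of ten inputs matter.
X = rng.normal(size=(100, 10))
true_w = np.zeros(10)
true_w[:2] = [1.5, -2.0]
y = X @ true_w + 0.1 * rng.normal(size=100)


def description_length(w, X, y, noise_var=0.01):
    """Crude two-part code: cost of encoding the residuals under
    Gaussian noise plus a fixed cost per nonzero weight."""
    misfit = 0.5 * np.sum((y - X @ w) ** 2) / noise_var
    model_cost = 0.5 * np.log(len(y)) * np.count_nonzero(w)
    return misfit + model_cost


# Fully connected: use all ten inputs.
w_dense = np.linalg.lstsq(X, y, rcond=None)[0]

# Constrained: limit the connections to the first two inputs.
w_sparse = np.zeros(10)
w_sparse[:2] = np.linalg.lstsq(X[:, :2], y, rcond=None)[0]

print("dense :", description_length(w_dense, X, y))
print("sparse:", description_length(w_sparse, X, y))
```

In this toy example, limiting the connections costs a little in misfit but saves more on the model-description term, which is the sense in which MDL favors the more concise network.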

4 Parsimony of Models

References: - Vandekerckhove, J., & Matzke, D. (2015). Model comparison and the principle of parsimony. Oxford Library of Psychology.
Summary: For models with many parameters, the goodness-of-fit is very likely to be high. However, such models are also likely to generalize badly, so we need a measure of generalizability. Here parsimony gives us a few advantages: the model is easier to perceive, and it generalizes better.
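A minimal sketch of this trade-off in NumPy; the sinusoidal generating process, the polynomial models, and the degrees compared are illustrative choices, not from the original post.

```python
import numpy as np

rng = np.random.default_rng(1)

def generate(n):
    """Noisy samples from a hidden generating process."""
    x = rng.uniform(-1, 1, n)
    return x, np.sin(3 * x) + 0.2 * rng.normal(size=n)

x_train, y_train = generate(20)
x_test, y_test = generate(200)

for degree in (2, 12):
    # Fit a polynomial of the given degree to the training sample.
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    # The less parsimonious (high-degree) fit achieves the better
    # goodness-of-fit on the training sample; the gap between its
    # training and test error is the price paid in generalizability.
    print(f"degree {degree}: train {train_mse:.3f}, test {test_mse:.3f}")
```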

3 Measures of Generalizability

Category: Model Selection
Summary: To measure generalization, we define a generalization error, $$ \begin{align} \mathcal G = \mathcal L_{P}(\hat f) - \mathcal L_E(\hat f), \end{align} $$ where $\mathcal L_{P}$ is the population loss, $\mathcal L_E$ is the empirical loss, and $\hat f$ is the model obtained by minimizing the empirical loss. However, we do not know the actual joint probability $p(x, y)$ of our dataset $\{x_i, y_i\}$, so the population loss is not known. In machine learning, we usually use cross validation, a method to estimate the risk, in place of the unknown population loss.
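Since $\mathcal L_P$ cannot be computed, a K-fold cross-validation estimate is the usual stand-in. Below is a minimal NumPy sketch; the toy regression data and the use of ordinary least squares are illustrative choices. (Strictly, cross validation estimates the risk of models refit on subsamples rather than $\mathcal L_P(\hat f)$ itself.)

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy regression data standing in for samples {x_i, y_i}.
X = rng.normal(size=(60, 5))
y = X @ rng.normal(size=5) + 0.3 * rng.normal(size=60)


def mse(w, X, y):
    return np.mean((X @ w - y) ** 2)


# Empirical loss L_E: fit and evaluate on the same data.
w_hat = np.linalg.lstsq(X, y, rcond=None)[0]
empirical_loss = mse(w_hat, X, y)

# K-fold cross validation: an estimate of the population loss L_P.
k = 5
folds = np.array_split(rng.permutation(len(y)), k)
cv_losses = []
for i in range(k):
    test_idx = folds[i]
    train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
    w = np.linalg.lstsq(X[train_idx], y[train_idx], rcond=None)[0]
    cv_losses.append(mse(w, X[test_idx], y[test_idx]))

cv_estimate = np.mean(cv_losses)
print("L_E  :", empirical_loss)
print("CV   :", cv_estimate)
print("gap G:", cv_estimate - empirical_loss)  # estimated generalization error
```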

2 Goodness-of-fit

Category: Model Selection
Summary: Does the data agree with the model? Calculate the distance between the data and the model predictions. Apply likelihood-based or Bayesian methods: compute the likelihood of observing the data under the assumed model; the result is a set of fitted parameters. … Why don’t we always use goodness-of-fit as the measure of a model’s quality? We may experience overfitting, and the model may not be intuitive. This is why we would like to balance it against parsimony using measures of generalizability.
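As a concrete sketch of the summary above: under a Gaussian noise assumption, the distance between data and model predictions and the likelihood of the data under the model are two views of the same quantity. The linear model, noise level, and data below are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(3)

# Observed data and a candidate model y = a * x + b.
x = np.linspace(0, 1, 50)
y_obs = 2.0 * x + 1.0 + 0.1 * rng.normal(size=50)


def log_likelihood(params, x, y_obs, sigma=0.1):
    """Gaussian log-likelihood: larger when the model's predictions
    are closer to the data, i.e. a goodness-of-fit score."""
    a, b = params
    residuals = y_obs - (a * x + b)
    return (-0.5 * np.sum(residuals ** 2) / sigma ** 2
            - len(x) * np.log(sigma * np.sqrt(2 * np.pi)))


# Fitting = choosing the parameters that maximize the likelihood
# (equivalently, minimize the distance between data and predictions).
a_hat, b_hat = np.polyfit(x, y_obs, 1)
print("fitted parameters:", a_hat, b_hat)
print("log-likelihood at the fit   :", log_likelihood((a_hat, b_hat), x, y_obs))
print("log-likelihood at a bad guess:", log_likelihood((0.0, 0.0), x, y_obs))
```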

1 Model Selection

Category: Model Selection
Summary: Suppose we have a generating process that produces numbers according to some distribution. Based on a data sample, we can construct theoretical models to represent the actual generating process. Which is a good model? The black curve represents the generating process. The red rectangle is a very simple model that captures some of the major samples. The blue step-wise model captures more of the sample data, but with more parameters.
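A rough NumPy rendering of the comparison described in that figure; the normal generating process, the single "rectangle" block, and the 20-bin step-wise model are my own stand-ins for the curves in the post.

```python
import numpy as np

rng = np.random.default_rng(4)

# Samples from a generating process (a standard normal stands in
# for the black curve in the figure).
samples = rng.normal(size=500)

# "Red rectangle": a single uniform block over the sample range.
lo, hi = samples.min(), samples.max()

def rectangle_logpdf(x):
    return np.where((x >= lo) & (x <= hi), -np.log(hi - lo), -np.inf)

# "Blue step-wise model": a histogram density with many more parameters.
heights, edges = np.histogram(samples, bins=20, density=True)

def stepwise_logpdf(x):
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, len(heights) - 1)
    return np.log(heights[idx] + 1e-12)

# The step-wise model fits the observed sample more closely,
# at the price of many more parameters.
print("rectangle mean log-likelihood:", rectangle_logpdf(samples).mean())
print("step-wise mean log-likelihood:", stepwise_logpdf(samples).mean())
```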