# Negative Sampling

Knowledge of [[CBOW]] or [[skipgram]] is required. CBOW (Continuous Bag of Words) uses the context to predict the center word; skip-gram (continuous skip-gram) uses the center word to predict the context.

A naive way to train a word model is to

- encode input words and output words using vectors,
- use the input word vector to predict the output word vector,
- calculate the errors between the predicted and the real output word vectors,
- minimize the errors.
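The steps above can be sketched with a full softmax over the vocabulary. This is a minimal illustration, not a reference implementation: the toy vocabulary, the embedding size `dim`, the random initialisation, and the helper name `softmax_loss` are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy vocabulary (assumed for illustration).
vocab = ["intended", "extravagant", "display", "to", "attract", "I", "a", "course"]
word_to_id = {w: i for i, w in enumerate(vocab)}
dim = 4  # embedding size, chosen arbitrarily

# One vector per word in the input role and one in the output role.
input_vectors = rng.normal(scale=0.1, size=(len(vocab), dim))
output_vectors = rng.normal(scale=0.1, size=(len(vocab), dim))


def softmax_loss(center: str, context: str) -> float:
    """Cross-entropy of predicting `context` from `center` over the whole vocabulary."""
    v = input_vectors[word_to_id[center]]
    scores = output_vectors @ v      # one score per vocabulary word
    scores = scores - scores.max()   # shift for numerical stability
    probs = np.exp(scores) / np.exp(scores).sum()
    return float(-np.log(probs[word_to_id[context]]))


loss = softmax_loss("intended", "display")
```

Note that `scores` has one entry per word in the vocabulary, so every single training pair touches all of `output_vectors`.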

However, projecting onto every word in the vocabulary and calculating the error at every training step is very expensive. A trick is to use **negative sampling**.

Negative sampling adds a new target column to the data: for each pair of input and output words, the model predicts whether the two words are neighbours.

| Input (Center Word) | Output (Context) | Target (is Neighbour) |
|---|---|---|
| `intended` | `extravagant` | 1 |
| `intended` | `display` | 1 |
| `intended` | `to` | 1 |
| `intended` | `attract` | 1 |
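Rows like these can be generated by sliding a window over the text. The sketch below assumes a window of two words on each side and a token order chosen so that it reproduces the table; neither the sentence nor the helper name `positive_pairs` comes from the source.

```python
def positive_pairs(tokens, window=2):
    """Return (center, context, 1) rows for every word within `window` of each center."""
    rows = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                rows.append((center, tokens[j], 1))
    return rows


# Assumed token order, picked to match the table above.
tokens = ["extravagant", "display", "intended", "to", "attract"]
rows = [r for r in positive_pairs(tokens) if r[0] == "intended"]
```

Here `rows` holds the four pairs for the center word `intended`, each with target 1.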

Now we have a problem: the target is always 1. Such a dataset might lead to a network that outputs 1 all the time. We need some negative samples to make it noisy, so we randomly sample words from the dictionary and give them the target 0.

| Input (Center Word) | Output (Context) | Target (is Neighbour) |
|---|---|---|
| `intended` | `extravagant` | 1 |
| `intended` | `display` | 1 |
| `intended` | `to` | 1 |
| `intended` | `attract` | 1 |
| `intended` | `I` | 0 |
| `intended` | `a` | 0 |
| `intended` | `intellect` | 0 |
| `intended` | `mating` | 0 |
| `intended` | `course` | 0 |
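A minimal sketch of how such a dataset could be built and scored follows. Several choices are assumptions made for illustration: negatives are drawn uniformly (word2vec implementations typically draw from a smoothed unigram distribution instead), `k` negatives are drawn per center word to mirror the table, and the helper names `add_negative_samples` and `mean_binary_loss` are hypothetical.

```python
import random

import numpy as np


def add_negative_samples(positives, vocab, k, seed=0):
    """Append k target-0 rows per center word, sampled uniformly from the vocabulary.

    The center word itself and its observed context words are excluded.
    """
    rng = random.Random(seed)
    rows = list(positives)
    contexts = {}
    for center, context, _ in positives:
        contexts.setdefault(center, set()).add(context)
    for center, true_contexts in contexts.items():
        candidates = [w for w in vocab if w != center and w not in true_contexts]
        for word in rng.sample(candidates, k):
            rows.append((center, word, 0))
    return rows


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def mean_binary_loss(rows, input_vectors, output_vectors, word_to_id):
    """Mean binary cross-entropy; each row needs one dot product, not a full softmax."""
    total = 0.0
    for center, context, target in rows:
        score = input_vectors[word_to_id[center]] @ output_vectors[word_to_id[context]]
        p = sigmoid(score)
        total += -(target * np.log(p) + (1 - target) * np.log(1 - p))
    return float(total / len(rows))


vocab = ["intended", "extravagant", "display", "to", "attract",
         "I", "a", "intellect", "mating", "course"]
positives = [("intended", w, 1) for w in ["extravagant", "display", "to", "attract"]]
rows = add_negative_samples(positives, vocab, k=5)

word_to_id = {w: i for i, w in enumerate(vocab)}
rng = np.random.default_rng(0)
input_vectors = rng.normal(scale=0.1, size=(len(vocab), 8))
output_vectors = rng.normal(scale=0.1, size=(len(vocab), 8))
loss = mean_binary_loss(rows, input_vectors, output_vectors, word_to_id)
```

The cost of one training row no longer depends on the vocabulary size: the full softmax is replaced by a single sigmoid of a dot product.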

For more rigorous derivations, see Goldberg2014.
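For orientation, the objective that the derivation arrives at can be written as follows. The notation is assumed here rather than taken from the text above: $D$ is the set of observed (word, context) pairs, $D'$ is the set of sampled negative pairs, $v_w$ and $v_c$ are the word and context vectors, $\theta$ collects all of these vectors, and $\sigma$ is the logistic sigmoid.

$$
\arg\max_{\theta} \sum_{(w,c) \in D} \log \sigma(v_c \cdot v_w) + \sum_{(w,c) \in D'} \log \sigma(-v_c \cdot v_w),
\qquad \sigma(x) = \frac{1}{1 + e^{-x}}
$$

In terms of the table, rows with target 1 play the role of $D$ and rows with target 0 play the role of $D'$.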

- Mikolov2013: Mikolov T, Chen K, Corrado G, Dean J. Efficient estimation of word representations in vector space. arXiv [cs.CL]. 2013. Available: http://arxiv.org/abs/1301.3781
- Goldberg2014: Goldberg Y, Levy O. word2vec Explained: deriving Mikolov et al.'s negative-sampling word-embedding method. arXiv [cs.CL]. 2014. Available: http://arxiv.org/abs/1402.3722
- The Illustrated Word2vec


L Ma (2020). 'Negative Sampling', Datumorphism, 01 April. Available at: https://datumorphism.leima.is/cards/machine-learning/embedding/negative-sampling/.