Negative Sampling

Knowledge of [[CBOW]] (Continuous Bag of Words: use the context to predict the center word) or [[skipgram]] (continuous skip-gram: use the center word to predict the context) is required.

A naive way to train a word embedding model is to

  1. encode input words and output words using vectors,
  2. use the input word vector to predict the output word vector,
  3. calculate the errors between the predicted output word vector and the real output word vector,
  4. minimize the errors.

However, it is very expensive to compute the predicted distribution over all output words and calculate the error at every step, because the output layer spans the whole vocabulary. A trick to avoid this cost is negative sampling.
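To see where the cost comes from, here is a minimal sketch of the naive step in NumPy. The vocabulary size, embedding dimension, and names are hypothetical; the point is that every training pair requires dot products with all output word vectors.

```python
import numpy as np

# Hypothetical sizes, only for illustration.
vocab_size, embed_dim = 50_000, 300

rng = np.random.default_rng(0)
W_in = rng.normal(scale=0.01, size=(vocab_size, embed_dim))   # input (center word) vectors
W_out = rng.normal(scale=0.01, size=(vocab_size, embed_dim))  # output (context word) vectors

def naive_loss(center_idx, context_idx):
    """Cross-entropy of the naive model: a softmax over the entire vocabulary."""
    h = W_in[center_idx]                      # vector of the input word
    scores = W_out @ h                        # vocab_size dot products -- the expensive part
    scores -= scores.max()                    # numerical stability
    probs = np.exp(scores) / np.exp(scores).sum()
    return -np.log(probs[context_idx])        # error against the real output word

print(naive_loss(1, 2))
```

A gradient step on this loss touches all `vocab_size` output vectors, which is what negative sampling avoids.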

Negative sampling turns the prediction problem into a binary classification: a new target column is added to the data, indicating whether the output word is a true neighbour of the input word.

| Input (Center Word) | Output (Context) | Target (is Neighbour) |
|---------------------|------------------|-----------------------|
| intended            | extravagant      | 1                     |
| intended            | display          | 1                     |
| intended            | to               | 1                     |
| intended            | attract          | 1                     |

Now we have a problem: the target is always 1, so this dataset might lead to a network that outputs 1 all the time. We need some negative samples to make it noisy, so we randomly sample words from the vocabulary as non-neighbours.

| Input (Center Word) | Output (Context) | Target (is Neighbour) |
|---------------------|------------------|-----------------------|
| intended            | extravagant      | 1                     |
| intended            | display          | 1                     |
| intended            | to               | 1                     |
| intended            | attract          | 1                     |
| intended            | I                | 0                     |
| intended            | a                | 0                     |
| intended            | intellect        | 0                     |
| intended            | mating           | 0                     |
| intended            | course           | 0                     |
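A minimal sketch of how such a table could be generated, assuming a toy sentence and hypothetical choices of window size and number of negatives per positive pair:

```python
import random

random.seed(42)

sentence = "intended to attract a mate with an extravagant display".split()
vocabulary = sorted(set(sentence) | {"I", "intellect", "mating", "course"})

def build_rows(tokens, window=2, k=2):
    """Return (center, context, target) rows: true neighbours get 1, random words get 0."""
    rows = []
    for i, center in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j == i:
                continue
            rows.append((center, tokens[j], 1))     # positive pair from the context window
            for _ in range(k):                      # k negative samples per positive pair
                rows.append((center, random.choice(vocabulary), 0))
    return rows

for row in build_rows(sentence)[:9]:
    print(row)
```

In this toy sketch the negatives are drawn uniformly and may occasionally collide with a true neighbour; word2vec itself draws negatives from a smoothed unigram distribution (word frequency raised to the power 3/4).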

For a more rigorous derivation, see Goldberg2014[^1].
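In brief, for a positive pair $(w, c)$ the model maximizes a logistic objective over that pair and $k$ negatives drawn from a noise distribution $P_n$, roughly of the form

$$
\log \sigma(\vec{v}_c \cdot \vec{v}_w) + \sum_{i=1}^{k} \mathbb{E}_{c_i \sim P_n}\left[ \log \sigma(-\vec{v}_{c_i} \cdot \vec{v}_w) \right],
$$

where $\sigma$ is the sigmoid function and $\vec{v}_w$, $\vec{v}_c$ are the input and output word vectors; the notation here loosely follows the reference above.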


[^1]: Goldberg Y, Levy O. word2vec Explained: deriving Mikolov et al.'s negative-sampling word-embedding method. arXiv [cs.CL]. 2014. Available: http://arxiv.org/abs/1402.3722
