Cosine Similarity
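Cosine similarity is as simple as the inner product of two vectors after each has been normalized to unit length, which equals the cosine of the angle $\theta$ between them:
$$ d_{cos} = \frac{\vec A}{\vert \vec A \vert} \cdot \frac{\vec B }{ \vert \vec B \vert} = \frac{\vec A \cdot \vec B}{\vert \vec A \vert \, \vert \vec B \vert} = \cos\theta $$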
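As a minimal sketch of the formula above, the following Python function computes the cosine similarity of two equal-length numeric vectors using only the standard library. The function name `cosine_similarity` and the choice to raise an error for zero vectors are assumptions made for this illustration; the formula itself is undefined when either vector has zero length.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of a and b divided by the product of their lengths."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of elements")

    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))

    # Assumption for this sketch: reject zero vectors, since dividing
    # by a zero length is undefined.
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")

    return dot / (norm_a * norm_b)


# Orthogonal vectors give 0; parallel vectors give 1 (up to rounding).
orthogonal = cosine_similarity([1, 0], [0, 1])
parallel = cosine_similarity([1, 2, 3], [2, 4, 6])
```

Because both vectors are normalized, the result depends only on their directions, not on their magnitudes.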
Examples
To use cosine similarity, we have to vectorize the words first. There are many different methods to achieve this. For the purpose of illustrating cosine similarity, we use term frequency.
Term frequency is the number of times each word occurs in a sentence. We do not remove duplicates, so repeated words are counted and have some effect on the similarity.
Given two sentences, the example proceeds as follows: take the word set of each sentence, use the union of the two word sets as the vector element labels, build the vector of each sentence over those labels, and compute the cosine similarity of the two vectors.
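The steps of this example can be sketched in Python as follows. The two sentences, the helper names, and the lowercase whitespace tokenization are all assumptions chosen for illustration; they are not the inputs or the implementation used by the original page.

```python
import math
from collections import Counter


def term_frequency_vectors(sentence_one: str, sentence_two: str):
    """Build term-frequency vectors over the union of both word sets."""
    # Assumed tokenization: lowercase, split on whitespace.
    counts_one = Counter(sentence_one.lower().split())
    counts_two = Counter(sentence_two.lower().split())

    # The union of the two word sets labels the vector elements.
    labels = sorted(set(counts_one) | set(counts_two))

    # Duplicates are kept: a word that appears twice contributes 2.
    vector_one = [counts_one[word] for word in labels]
    vector_two = [counts_two[word] for word in labels]
    return labels, vector_one, vector_two


def cosine(a, b):
    """Cosine similarity of two equal-length, nonzero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical input sentences.
labels, vec_one, vec_two = term_frequency_vectors(
    "the cat sat on the mat",
    "the dog sat on the log",
)
similarity = cosine(vec_one, vec_two)

print(labels)      # ['cat', 'dog', 'log', 'mat', 'on', 'sat', 'the']
print(vec_one)     # [1, 0, 0, 1, 1, 1, 2]
print(vec_two)     # [0, 1, 1, 0, 1, 1, 2]
print(similarity)  # approximately 0.75
```

Here the repeated word "the" has count 2 in both vectors, so it contributes 4 of the 6 in the inner product; this is the effect of keeping duplicates that the text mentions.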
Planted:
by L Ma;
L Ma (2019). 'Cosine Similarity', Datumorphism, 05 April. Available at: https://datumorphism.leima.is/cards/math/cosine-similarity/.