Jaccard Similarity
The Jaccard index of two sets $A$ and $B$ is the ratio of the size of their intersection to the size of their union:
$$ J(A, B) = \frac{ \vert A \cap B \vert }{ \vert A \cup B \vert } $$
The Jaccard distance $d_J(A,B)$ is defined as
$$ d_J(A,B) = 1 - J(A,B). $$
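As a small worked example (the sets here are chosen only for illustration), take $A = \{1, 2, 3\}$ and $B = \{2, 3, 4\}$. Then $A \cap B = \{2, 3\}$ and $A \cup B = \{1, 2, 3, 4\}$, so
$$ J(A, B) = \frac{\vert \{2, 3\} \vert}{\vert \{1, 2, 3, 4\} \vert} = \frac{2}{4} = 0.5, \qquad d_J(A, B) = 1 - 0.5 = 0.5. $$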
Properties
If the two sets are the same and non-empty, $A=B$, we have $J(A,B)=1$ and $d_J(A,B)=0$: the similarity is at its maximum.
If the two sets have nothing in common, $A \cap B = \emptyset$, we have $J(A,B)=0$ and $d_J(A,B)=1$: the similarity is at its minimum.
Examples
Interactive example: the word sets of two sentences are compared by computing their intersection and union, and from these the Jaccard index and the Jaccard distance.
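The word-set comparison can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the page: the two sentences, the helper names `word_set`, `jaccard_index` and `jaccard_distance`, the lowercase-and-split-on-whitespace tokenization, and the convention of returning 1 for two empty sets are all assumptions made here.

```python
def word_set(sentence):
    """Lowercase the sentence and split on whitespace (assumed tokenization)."""
    return set(sentence.lower().split())


def jaccard_index(a, b):
    """Return |A ∩ B| / |A ∪ B| for two sets."""
    union = a | b
    if not union:
        # Both sets are empty, so the ratio is 0/0.
        # Treating them as identical is a convention assumed here.
        return 1.0
    return len(a & b) / len(union)


def jaccard_distance(a, b):
    """Return 1 - J(A, B)."""
    return 1.0 - jaccard_index(a, b)


# Hypothetical sentences, chosen only for illustration.
sentence_one = "the cat sat on the mat"
sentence_two = "the dog sat on the log"

words_one = word_set(sentence_one)
words_two = word_set(sentence_two)

print("Word set:", sorted(words_one))
print("Word set:", sorted(words_two))
print("Intersection:", sorted(words_one & words_two))
print("Union:", sorted(words_one | words_two))
print("Jaccard index:", jaccard_index(words_one, words_two))
print("Jaccard distance:", jaccard_distance(words_one, words_two))
```

For these two sentences the word sets share 3 words out of 7 distinct words in total, so the index is 3/7 and the distance is 4/7. Repeated words such as "the" count once, because the comparison works on sets rather than on word counts.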
Planted:
by L Ma;
References:
L Ma (2019). 'Jaccard Similarity', Datumorphism, 05 April. Available at: https://datumorphism.leima.is/cards/math/jaccard-similarity/.