# Combinations

Published: 2019-04-07

Category: { Math }

Tags:

Summary: The number of ways to choose $X$ items from $N$ is
$$ C_N^X = \frac{N!}{ X! (N-X)! } $$
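
The formula can be checked numerically. The following is a minimal sketch; the helper name `choose` is hypothetical, and `math.comb` (Python 3.8+) is used only as a cross-check.

```python
import math


def choose(n: int, x: int) -> int:
    """Number of ways to choose x items from n, via the factorial formula."""
    if x < 0 or x > n:
        return 0  # no valid selections outside 0 <= x <= n
    return math.factorial(n) // (math.factorial(x) * math.factorial(n - x))


print(choose(5, 2))                     # 10
print(choose(5, 2) == math.comb(5, 2))  # True
```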

Pages: 24

# Term Frequency - Inverse Document Frequency

Published: 2019-05-06

Category: { Math }

Tags:

Summary:

Pages: 24

# Jaccard Similarity

Published: 2019-05-06

Category: { Math }

Tags:

References:
- Jaccard index

Summary: The Jaccard index is the ratio of the size of the intersection of two sets to the size of their union,
$$ J(A, B) = \frac{ \vert A \cap B \vert }{ \vert A \cup B \vert } $$
Jaccard distance $d_J(A,B)$ is defined as
$$ d_J(A,B) = 1 - J(A,B). $$
Properties: If the two sets are the same, $A=B$, we have $J(A,B)=1$ and $d_J(A,B)=0$. This is the maximum similarity.
If the two sets have nothing in common, we have $J(A,B)=0$ and $d_J(A,B)=1$. This is the minimum similarity.
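
A minimal sketch of the index and the distance on the word sets of two sentences. The example sentences and function names are made up for illustration, and treating two empty sets as identical is an assumption, since the ratio is undefined in that case.

```python
def jaccard_index(a: set, b: set) -> float:
    """Size of the intersection divided by the size of the union."""
    union = a | b
    if not union:
        return 1.0  # assumption: two empty sets count as identical
    return len(a & b) / len(union)


def jaccard_distance(a: set, b: set) -> float:
    return 1.0 - jaccard_index(a, b)


sentence_one_words = set("the quick brown fox".split())
sentence_two_words = set("the lazy brown dog".split())

print(sorted(sentence_one_words & sentence_two_words))  # ['brown', 'the']
print(jaccard_index(sentence_one_words, sentence_two_words))
print(jaccard_distance(sentence_one_words, sentence_two_words))
```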

Pages: 24

# Eigenvalues and Eigenvectors

Published: 2019-05-06

Category: { Math }

Tags:

References:
- Eigenvectors and Eigenvalues @ Explained Visually

Summary: To find the eigenvectors $\mathbf x$ of a matrix $\mathbf A$, we construct the eigenvalue equation
$$ \mathbf A \mathbf x = \lambda \mathbf x, $$
where $\lambda$ is the eigenvalue.
We rewrite it in component form, with summation over the repeated index $j$ implied,
$$ \begin{equation} A_{ij} x_j = \lambda x_i. \label{eqn-eigen-decomp-def} \end{equation} $$
Mathematically speaking, it is straightforward to find the eigenvectors and eigenvalues.
Eigenvectors are Special Directions: Judging from the definition in Eq.($\ref{eqn-eigen-decomp-def}$), the eigenvectors do not change direction under the operation of the matrix $\mathbf A$; they are only rescaled by $\lambda$.
Reconstruct $\mathbf A$: We can reconstruct $\mathbf A$ using the eigenvalues and eigenvectors.
First of all, we will construct a matrix of eigenvectors,
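
A minimal sketch of both points for a hand-picked $2 \times 2$ matrix. The matrix, its eigenpairs, and the helper names are chosen for illustration, and the reconstruction $\mathbf A = \mathbf P \mathbf D \mathbf P^{-1}$ (columns of $\mathbf P$ are eigenvectors, $\mathbf D$ is the diagonal matrix of eigenvalues) is the standard form, assumed here because the summary above is cut short.

```python
def matvec(a, x):
    """Matrix-vector product for nested lists."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def matmul(a, b):
    """Matrix-matrix product for nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


A = [[2.0, 1.0], [1.0, 2.0]]
# Eigenpairs of A, worked out by hand: (eigenvalue, eigenvector).
eigenpairs = [(3.0, [1.0, 1.0]), (1.0, [1.0, -1.0])]

# A x = lambda x: the direction of x is unchanged, only its length scales.
for lam, x in eigenpairs:
    assert matvec(A, x) == [lam * x_i for x_i in x]

P = [[1.0, 1.0], [1.0, -1.0]]      # eigenvectors as columns
D = [[3.0, 0.0], [0.0, 1.0]]       # eigenvalues on the diagonal
P_inv = [[0.5, 0.5], [0.5, -0.5]]  # inverse of P, computed by hand

reconstructed = matmul(matmul(P, D), P_inv)
print(reconstructed)  # [[2.0, 1.0], [1.0, 2.0]]
```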

Pages: 24

# Cosine Similarity

Published: 2019-05-06

Category: { Math }

Tags:

References:
- Cosine Similarity

Summary: Cosine similarity is as simple as the inner product of two normalized vectors,
$$ d_{cos} = \frac{\vec A}{\vert \vec A \vert} \cdot \frac{\vec B }{ \vert \vec B \vert} $$
Examples: To use cosine similarity on text, we have to vectorize the words first. There are many different methods to achieve this. For the purpose of illustrating cosine similarity, we use term frequency.
Term frequency is the number of occurrences of each word. We do not remove duplicates, so repeated words affect the similarity.
In principle, we could also use the word set of a sentence to remove the effect of duplicate words. In most cases, however, a repeated word does make the sentences different.
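
A minimal sketch of cosine similarity on term-frequency vectors. Splitting on whitespace and lowercasing are simplifying assumptions for illustration, and returning `0.0` for an empty sentence is a convention chosen here because the ratio is undefined in that case.

```python
import math
from collections import Counter


def cosine_similarity(sentence_a: str, sentence_b: str) -> float:
    """Cosine similarity of the term-frequency vectors of two sentences."""
    tf_a = Counter(sentence_a.lower().split())
    tf_b = Counter(sentence_b.lower().split())
    # Only words present in both sentences contribute to the inner product.
    dot = sum(tf_a[w] * tf_b[w] for w in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(c * c for c in tf_a.values()))
    norm_b = math.sqrt(sum(c * c for c in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an empty sentence is similar to nothing
    return dot / (norm_a * norm_b)


print(cosine_similarity("the cat sat", "the cat sat"))
print(cosine_similarity("the cat the cat", "the cat"))
print(cosine_similarity("the cat", "a dog"))
```

With term frequency, only the relative counts matter: the second call compares vectors that point in the same direction, so repeating every word equally often leaves the similarity unchanged.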

Pages: 24

# n-gram

Published: 2019-05-19

Category: { Math }

Tags:

References:
- words/n-gram

Summary: n-gram is a method to split words into sets of substrings so that they can be used to match words.
Examples Use the following examples to get your first idea about it. I created two columns so that we could compare the n-grams of two different words side-by-side.
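
A minimal sketch of character n-grams for two words. The function name and the cleaning step (trimming whitespace and lowercasing) are assumptions for illustration.

```python
def char_ngrams(word: str, n: int = 2) -> list:
    """Return all contiguous substrings of length n of the cleaned word."""
    clean = word.strip().lower()
    return [clean[i:i + n] for i in range(len(clean) - n + 1)]


word_one, word_two = "Night", "Nacht"
print(char_ngrams(word_one, 2))  # ['ni', 'ig', 'gh', 'ht']
print(char_ngrams(word_two, 2))  # ['na', 'ac', 'ch', 'ht']

# n-grams shared by both words can be used to match them.
print(set(char_ngrams(word_one, 2)) & set(char_ngrams(word_two, 2)))  # {'ht'}
```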

Pages: 24

# Levenshtein Distance

Published: 2019-05-19

Category: { Math }

Tags:

References:
- levenshtein-distance @ trekhleb/javascript-algorithms

Summary: Levenshtein distance is the minimum number of single-character edits (insertions, deletions or substitutions) needed to change one word into another.
The reference explains this concept very well. For consistency, I extracted a paragraph from it which explains the operations in Levenshtein algorithm. The source of the following paragraph is the first reference of this article.
Levenshtein Matrix
Cell (0:1) contains red number 1. It means that we need 1 operation to transform M to an empty string. And it is by deleting M. This is why this number is red. Cell (0:2) contains red number 2. It means that we need 2 operations to transform ME to an empty string.
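
A minimal sketch of the matrix construction using the standard dynamic-programming recurrence. The function name is hypothetical, and this is not the code from the reference.

```python
def levenshtein(source: str, target: str) -> int:
    """Minimum number of single-character edits turning source into target."""
    rows, cols = len(source) + 1, len(target) + 1
    dist = [[0] * cols for _ in range(rows)]
    # First column and row: transforming a prefix to or from the empty string.
    for i in range(rows):
        dist[i][0] = i
    for j in range(cols):
        dist[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
                dist[i - 1][j - 1] + cost,  # substitution or match
            )
    return dist[-1][-1]


print(levenshtein("M", ""))   # 1
print(levenshtein("ME", ""))  # 2
print(levenshtein("kitten", "sitting"))  # 3
```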

Pages: 24

# Frobenius distance

Published: 2019-06-17

Category: { Math }

Tags:

References:
- Weisstein. Frobenius Norm. [cited 8 Nov 2021]. Available: https://mathworld.wolfram.com/FrobeniusNorm.html

Summary: The squared Frobenius distance between the matrix $X_{n}^{\phantom{n}k}$ and the product $H_n^{\phantom{n}r} W_r^{\phantom{r}k}$ (summation over the repeated index $r$ implied) is
$$ \lVert X_{n}^{\phantom{n}k} - H_n^{\phantom{n}r} W_r^{\phantom{r}k} \rVert^2 \equiv \sum_{n,k} (X_{n}^{\phantom{n}k} - H_n^{\phantom{n}r} W_r^{\phantom{r}k})^2. $$
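
A minimal sketch of the sum above with nested lists. The function name and the small matrices are made up for illustration.

```python
def frobenius_distance_squared(X, H, W):
    """Sum over n, k of (X[n][k] - sum_r H[n][r] * W[r][k]) ** 2."""
    rank = len(W)
    total = 0.0
    for n in range(len(X)):
        for k in range(len(X[0])):
            approx = sum(H[n][r] * W[r][k] for r in range(rank))
            total += (X[n][k] - approx) ** 2
    return total


X = [[1.0, 2.0], [3.0, 4.0]]
H = [[1.0], [1.0]]   # shape (n, r) with r = 1
W = [[1.0, 2.0]]     # shape (r, k)

# H W = [[1, 2], [1, 2]], so only the second row of X contributes: 2^2 + 2^2.
print(frobenius_distance_squared(X, H, W))  # 8.0
```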

Pages: 24

# Tucker Decomposition

Published: 2019-06-18

Category: { Math }

Tags:

Summary: Tucker decomposition is a generalization of SVD to higher-order tensors

Pages: 24

# SVD: Singular Value Decomposition

Published: 2019-06-18

Category: { Math }

Tags:

Summary: Given a matrix $\mathbf X \to X_{m}^{\phantom{m}n}$, we can decompose it into three matrices
$$ X_{m}^{\phantom{m}n} = U_{m}^{\phantom{m}k} D_{k}^{\phantom{k}l} (V_{n}^{\phantom{n}l} )^{\mathrm T}, $$
where $D_{k}^{\phantom{k}l}$ is diagonal.
Here $\mathbf U$ is constructed from the eigenvectors of $\mathbf X \mathbf X^{\mathrm T}$, while $\mathbf V$ is constructed from the eigenvectors of $\mathbf X^{\mathrm T} \mathbf X$ (which is also the reason we keep the transpose).
I find this slide from Christoph Freudenthaler very useful. The original slide has been added as a reference to this article.
SVD visualized by Christoph Freudenthaler

Pages: 24

# Modes and Slices of Tensors

Published: 2019-06-18

Category: { Math }

Tags:

Summary: Simple decomposition of tensors

Pages: 24

# Khatri-Rao Product

Published: 2019-06-18

Category: { Math }

Tags:

References:
- Kronecker product

Summary: $$ \mathbf{A} \ast \mathbf{B} = \left(\mathbf{A}_{ij} \otimes \mathbf{B}_{ij}\right)_{ij} $$
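
A minimal sketch of the common column-wise special case, in which every column is treated as one block, so the result stacks the Kronecker products of matching columns. The function name is hypothetical, and the general block-partitioned form in the formula above is not covered.

```python
def khatri_rao(A, B):
    """Column-wise Khatri-Rao product of two matrices given as nested lists."""
    if len(A[0]) != len(B[0]):
        raise ValueError("A and B must have the same number of columns")
    n_cols = len(A[0])
    # Row (i, k) of the result holds A[i][j] * B[k][j] for every column j,
    # which is the Kronecker product of column j of A with column j of B.
    return [[A[i][j] * B[k][j] for j in range(n_cols)]
            for i in range(len(A)) for k in range(len(B))]


A = [[1, 2], [3, 4]]
B = [[1, 0], [0, 1]]
print(khatri_rao(A, B))  # [[1, 0], [0, 2], [3, 0], [0, 4]]
```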

Pages: 24

# Cholesky Decomposition

Published: 2019-06-18

Category: { Math }

Tags:

Summary: Decomposing a Hermitian positive-definite matrix into the product of a lower triangular matrix and its conjugate transpose
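
A minimal sketch for a real symmetric positive-definite matrix, which is factored as $\mathbf A = \mathbf L \mathbf L^{\mathrm T}$ with $\mathbf L$ lower triangular. The function name and the example matrix are chosen for illustration.

```python
import math


def cholesky(A):
    """Return lower-triangular L with A = L L^T, filled in row by row."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = A[i][i] - s
                if d <= 0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(d)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


A = [[4.0, 12.0, -16.0], [12.0, 37.0, -43.0], [-16.0, -43.0, 98.0]]
print(cholesky(A))  # [[2.0, 0.0, 0.0], [6.0, 1.0, 0.0], [-8.0, 5.0, 3.0]]
```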

Pages: 24

# Canonical Decomposition

Published: 2019-06-18

Category: { Math }

Tags:

Summary: Canonical decomposition

Pages: 24

# Mahalanobis Distance

Published: 2020-03-11

Category: { Math }

Tags:

References:
- Mahalanobis distance @ Wikipedia

Summary: Distance between a point and a distribution by measuring the distance between the point and the mean of the distribution using the coordinate system defined by the principal components.
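
In symbols, assuming the distribution has mean $\boldsymbol \mu$ and an invertible covariance matrix $\mathbf \Sigma$ (both symbols are introduced here for illustration), the distance of a point $\mathbf x$ is
$$ d_M(\mathbf x) = \sqrt{ (\mathbf x - \boldsymbol \mu)^{\mathrm T} \mathbf \Sigma^{-1} (\mathbf x - \boldsymbol \mu) }. $$
When $\mathbf \Sigma$ is the identity matrix, this reduces to the Euclidean distance between $\mathbf x$ and $\boldsymbol \mu$.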

Pages: 24

# Diagonalize Matrices

Published: 2020-03-11

Category: { Math }

Tags:

Summary: Diagonalizing a matrix is a transformation into its eigenspace, where the matrix becomes diagonal.
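
In symbols, assuming $\mathbf A$ has a full set of linearly independent eigenvectors (the symbols $\mathbf P$ and $\mathbf D$ are introduced here for illustration), we collect the eigenvectors as the columns of $\mathbf P$ and obtain
$$ \mathbf D = \mathbf P^{-1} \mathbf A \mathbf P, $$
where $\mathbf D$ is diagonal with the eigenvalues of $\mathbf A$ on its diagonal.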

Pages: 24

# Multiset, mset or bag

Published: 2020-12-27

Category: { Math }

Tags:

References:
- Multiset @ Wikipedia

Summary: A bag is a set in which duplicate elements are allowed.
An ordered bag is a list that we use in programming.
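
A minimal sketch using Python's `collections.Counter`, which stores each element together with its multiplicity and can therefore stand in for a bag. The example elements are made up.

```python
from collections import Counter

bag = Counter(["a", "b", "a", "c", "a"])
other = Counter(["a", "b", "b"])

print(bag["a"])           # 3, the multiplicity of "a"
print(sum(bag.values()))  # 5, the size of the bag counting duplicates

intersection = bag & other  # minimum of the multiplicities
union = bag | other         # maximum of the multiplicities
total = bag + other         # sum of the multiplicities

# An ordered bag is just a list: order and duplicates are both kept.
ordered_bag = ["a", "b", "a", "c", "a"]
print(Counter(ordered_bag) == bag)  # True
```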

Pages: 24

# Jensen's Inequality

Published: 2021-04-12

Category: { Math }

Tags:

References:
- Jensen's Inequality @ Wikipedia

Summary: Jensen’s inequality shows that
$$ f(\mathbb E(X)) \leq \mathbb E(f(X)) $$
for a convex function $f(\cdot)$. For a concave function, the inequality is reversed.
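
A minimal numerical check with the convex function $f(x) = x^2$ and a small discrete distribution. The values, probabilities, and helper names are made up for illustration.

```python
def expectation(values, probs):
    """Expected value of a discrete random variable."""
    return sum(v * p for v, p in zip(values, probs))


def f(x):
    return x * x  # a convex function


values = [1.0, 2.0, 6.0]
probs = [0.5, 0.25, 0.25]

f_of_mean = f(expectation(values, probs))                # f(E(X)) = 2.5 ** 2
mean_of_f = expectation([f(v) for v in values], probs)   # E(f(X))

print(f_of_mean, mean_of_f)  # 6.25 10.5
print(f_of_mean <= mean_of_f)  # True
```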

Pages: 24

# Gaussian Integrals

Published: 2021-05-11

Category: { Math }

Tags:

References:
- reference for multidimensional gaussian integral

Summary: The Gaussian integral is one of the most useful integrals because it can be written down in closed form.
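
For orientation, the standard one-dimensional result for $a > 0$ is
$$ \int_{-\infty}^{\infty} e^{-a x^2} \, \mathrm d x = \sqrt{\frac{\pi}{a}}, $$
and the multidimensional version, assuming a real symmetric positive-definite $n \times n$ matrix $\mathbf A$ and a vector $\mathbf b$ (both symbols are introduced here for illustration), is
$$ \int \exp\left( -\frac{1}{2} \mathbf x^{\mathrm T} \mathbf A \mathbf x + \mathbf b^{\mathrm T} \mathbf x \right) \mathrm d^n x = \sqrt{\frac{(2\pi)^n}{\det \mathbf A}} \exp\left( \frac{1}{2} \mathbf b^{\mathrm T} \mathbf A^{-1} \mathbf b \right). $$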

Pages: 24

# The Hubbard-Stratonovich Identity

Published: 2021-06-17

Category: { Math }

Tags:

References:
- Hubbard J. Calculation of Partition Functions. Physical Review Letters. 1959. pp. 77–78. doi:10.1103/physrevlett.3.77

Summary: Very useful in calculating the partition function
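
In its simplest scalar form, for $a > 0$ (the symbols $a$, $x$, and $y$ are chosen here for illustration), the identity reads
$$ \exp\left( \frac{a}{2} x^2 \right) = \frac{1}{\sqrt{2 \pi a}} \int_{-\infty}^{\infty} \exp\left( -\frac{y^2}{2a} + x y \right) \mathrm d y. $$
It follows from completing the square in the exponent and applying the Gaussian integral. The term quadratic in $x$ is traded for a term linear in $x$ coupled to the auxiliary variable $y$.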

Pages: 24