# Mahalanobis Distance

Published:
Category: { Math }
Tags:
References:
Summary: The Mahalanobis distance is the distance between a point and a distribution. It is measured from the point to the mean of the distribution, using the coordinate system defined by the principal components.
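
As a rough illustration, the distance can be written as $\sqrt{(\mathbf x - \boldsymbol\mu)^{\mathrm T} \Sigma^{-1} (\mathbf x - \boldsymbol\mu)}$, where $\boldsymbol\mu$ is the mean and $\Sigma$ is the covariance matrix of the distribution. The sketch below covers only the two-dimensional case and assumes the mean and covariance are already known; the function name `mahalanobis_2d` is made up for this example.

```python
import math


def mahalanobis_2d(point, mean, cov):
    """Mahalanobis distance for 2-D data, given a 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    # Closed-form inverse of a 2x2 matrix.
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu).
    q = dx * (inv[0][0] * dx + inv[0][1] * dy) + dy * (inv[1][0] * dx + inv[1][1] * dy)
    return math.sqrt(q)
```

With an identity covariance this reduces to the Euclidean distance; a larger variance along one axis shrinks distances measured along that axis.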

# Diagonalize Matrices

Published:
Category: { Math }
Summary: Diagonalizing a matrix is a transformation into the basis formed by its eigenvectors (its eigenspace), where the matrix becomes diagonal.

# Tucker Decomposition

Published:
Category: { Math }
Summary: Tucker decomposition is a generalization of SVD to higher-order tensors.

# SVD: Singular Value Decomposition

Published:
Category: { Math }
Summary: Given a matrix $\mathbf X \to X_{m}^{\phantom{m}n}$, we can decompose it into three matrices $$X_{m}^{\phantom{m}n} = U_{m}^{\phantom{m}k} D_{k}^{\phantom{k}l} (V_{n}^{\phantom{n}l} )^{\mathrm T},$$ where $D_{k}^{\phantom{k}l}$ is diagonal. Here $\mathbf U$ is constructed from the eigenvectors of $\mathbf X \mathbf X^{\mathrm T}$, while $\mathbf V$ is constructed from the eigenvectors of $\mathbf X^{\mathrm T} \mathbf X$ (which is also why we keep the transpose). I find this slide from Christoph Freudenthaler very useful.

# Modes and Slices of Tensors

Published:
Category: { Math }
Tags:
Summary: Simple decomposition of tensors

# Khatri-Rao Product

Published:
Category: { Math }
Tags:
References:
Summary: The Khatri-Rao product is the column-wise Kronecker product of two matrices that have the same number of columns.

# Canonical Decomposition

Published:
Category: { Math }
Summary: Canonical decomposition

# n-gram

Published:
Category: { Math }
Tags:
References:
Summary: n-gram is a method to split words into a set of substring elements so that those can be used to match words. Examples: Use the following examples to get your first idea about it. I created two columns so that we could compare the n-grams of two different words side-by-side.
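
To make the splitting concrete, here is a minimal sketch of character n-grams. It is not the `nGram` function from the original article; the name `char_ngrams` and the default `n=2` are assumptions made for this example.

```python
def char_ngrams(word, n=2):
    """Return the list of overlapping character n-grams of a word."""
    # A word shorter than n yields an empty list.
    return [word[i:i + n] for i in range(len(word) - n + 1)]
```

For instance, `char_ngrams("night", 2)` gives `["ni", "ig", "gh", "ht"]`. Two words can then be matched by comparing their n-gram sets.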

# Levenshtein Distance

Published:
Category: { Math }
Tags:
Summary: Levenshtein distance is the minimum number of single-character edits (insertions, deletions, or substitutions) needed to change one word into another. The reference explains this concept very well. For consistency, I extracted a paragraph from it that explains the operations in the Levenshtein algorithm. The source of the following paragraph is the first reference of this article. Levenshtein Matrix: Cell (0:1) contains red number 1. It means that we need 1 operation to transform M to an empty string.
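
The matrix described above can be filled row by row with dynamic programming. The following is a sketch of that standard approach, which keeps only two rows of the matrix in memory; the function name `levenshtein` is chosen for this example.

```python
def levenshtein(source, target):
    """Minimum number of insertions, deletions and substitutions."""
    # Row 0: turning an empty prefix of source into target[:j] takes j insertions.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        # Column 0: turning source[:i] into an empty string takes i deletions.
        current = [i]
        for j, t_char in enumerate(target, start=1):
            cost = 0 if s_char == t_char else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]
```

Here `levenshtein("M", "")` returns 1, matching the cell that says one operation is needed to transform M into an empty string.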

# Term Frequency - Inverse Document Frequency

Published:
Category: { Math }
Tags:
References:
Summary:

# Jaccard Similarity

Published:
Category: { Math }
Tags:
References:
Summary: The Jaccard index is the ratio of the size of the intersection of two sets to the size of their union, $$J(A, B) = \frac{ \vert A \cap B \vert }{ \vert A \cup B \vert }.$$ The Jaccard distance $d_J(A,B)$ is defined as $$d_J(A,B) = 1 - J(A,B).$$ Properties: If the two sets are the same, $A=B$, we have $J(A,B)=1$ or $d_J(A,B)=0$, i.e., maximum similarity.
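
The two definitions translate almost directly into set operations. This is a minimal sketch; the function names are made up for the example, and returning 1 for two empty sets is a convention assumed here, since the formula itself is undefined in that case.

```python
def jaccard_index(a, b):
    """J(A, B) = |A intersect B| / |A union B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Assumed convention: two empty sets count as identical.
        return 1.0
    return len(a & b) / len(union)


def jaccard_distance(a, b):
    """d_J(A, B) = 1 - J(A, B)."""
    return 1.0 - jaccard_index(a, b)
```

For example, `jaccard_index({1, 2, 3}, {2, 3, 4})` is 2/4 = 0.5, and identical sets give a distance of 0.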

# Eigenvalues and Eigenvectors

Published:
Category: { Math }
Tags:
References:
Summary: To find the eigenvectors $\mathbf x$ of a matrix $\mathbf A$, we construct the eigen equation $$\mathbf A \mathbf x = \lambda \mathbf x,$$ where $\lambda$ is the eigenvalue. We rewrite it in component form, $$A_{ij} x_j = \lambda x_i. \label{eqn-eigen-decomp-def}$$ Mathematically speaking, it is straightforward to find the eigenvectors and eigenvalues. Eigenvectors are Special Directions: Judging from the definition in Eq.($\ref{eqn-eigen-decomp-def}$), the eigenvectors do not change direction under the operation of the matrix $\mathbf A$.
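
As a small worked check of the eigen equation, the sketch below handles only a real $2 \times 2$ matrix with real eigenvalues, using the characteristic polynomial $\lambda^2 - \mathrm{tr}(\mathbf A)\lambda + \det(\mathbf A) = 0$. The helper names `eigenvalues_2x2` and `mat_vec` are made up for this example.

```python
import math


def eigenvalues_2x2(A):
    """Real eigenvalues of a 2x2 matrix from its characteristic polynomial."""
    (a, b), (c, d) = A
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4 * det
    if disc < 0:
        raise ValueError("complex eigenvalues are not handled in this sketch")
    root = math.sqrt(disc)
    return ((trace + root) / 2, (trace - root) / 2)


def mat_vec(A, x):
    """Component form of the product: (A x)_i = sum_j A_ij x_j."""
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]
```

For $\mathbf A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$ the eigenvalues are 3 and 1. Applying `mat_vec` to $(1, 1)$ gives $(3, 3)$ and to $(1, -1)$ gives $(1, -1)$, so both vectors keep their direction and are only rescaled by their eigenvalue.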

# Cosine Similarity

Published:
Category: { Math }
Tags:
References:
Summary: Cosine similarity is as simple as the inner product of two normalized vectors, $$d_{cos} = \frac{\vec A}{\vert \vec A \vert} \cdot \frac{\vec B }{ \vert \vec B \vert}.$$ Examples: To use cosine similarity, we have to vectorize the words first. There are many different methods to achieve this. For the purpose of illustrating cosine similarity, we use term frequency. Term frequency is the number of occurrences of each word. We do not remove duplicates, so duplicate words will have some effect on the similarity.
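
A minimal sketch of this procedure with term-frequency vectors follows. Splitting on whitespace and lowercasing are assumptions made for the example, as is returning 0 when one of the texts is empty; the function name `cosine_similarity` is also chosen here.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity of two texts using raw term-frequency vectors."""
    # Term frequency: how often each word occurs (duplicates are kept).
    tf_a = Counter(text_a.lower().split())
    tf_b = Counter(text_b.lower().split())
    # Only words present in both texts contribute to the inner product.
    dot = sum(tf_a[w] * tf_b[w] for w in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(v * v for v in tf_a.values()))
    norm_b = math.sqrt(sum(v * v for v in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        # Assumed convention for empty input.
        return 0.0
    return dot / (norm_a * norm_b)
```

Two texts with no words in common score 0, while texts with the same word counts score 1 up to floating-point rounding.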

# Combinations

Published:
Category: { Math }
Tags:
Summary: The number of ways to choose $X$ items from $N$ is $$C_N^X = \frac{N!}{ X! (N-X)! }$$
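
The formula can be evaluated directly with factorials. This is a small sketch; the function name `choose` is made up for the example, and returning 0 when $X$ is outside the range $0 \le X \le N$ is an assumed convention.

```python
import math


def choose(n, x):
    """Number of ways to choose x items from n: n! / (x! (n - x)!)."""
    if x < 0 or x > n:
        # Assumed convention: no way to choose outside the valid range.
        return 0
    return math.factorial(n) // (math.factorial(x) * math.factorial(n - x))
```

For example, `choose(5, 2)` is $\frac{5!}{2!\,3!} = 10$. On Python 3.8 and later, the standard library's `math.comb(n, x)` computes the same quantity.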