# Weisfeiler-Lehman Kernel

Published: 2021-09-25

Category: { Graph }

Tags:

References:
- Shervashidze N, Schweitzer P, van Leeuwen EJ, Mehlhorn K, Borgwardt KM. Weisfeiler-Lehman Graph Kernels. J Mach Learn Res. 2011;12: 2539–2561. Available: https://dl.acm.org/doi/10.5555/1953048.2078187
- Hamilton WL. Graph Representation Learning. Morgan & Claypool Publishers; 2020. pp. 1–159. doi:10.2200/S01045ED1V01Y202009AIM046

Summary: The Weisfeiler-Lehman kernel is an iterative integration of neighborhood information.
We initialize the label of each node with its own node degree. At each step, we take the neighboring node degrees to form a [[multiset]] (also called a bag: a set in which duplicate elements are allowed). At step $K$, we have the multisets for each node. These multisets can be processed to form a representation of the graph, which is in turn used to calculate statistics of the graph.
Iterate $k$ steps
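
The iteration can be sketched in a few lines of Python. This is a minimal sketch, not the reference implementation: the adjacency-list input format, the function names `wl_labels` and `wl_feature`, and the use of Python's built-in `hash` to compress each (own label, neighbor multiset) pair are all assumptions made for illustration.

```python
from collections import Counter


def wl_labels(adjacency, num_iterations):
    """Relabel nodes for `num_iterations` Weisfeiler-Lehman steps.

    `adjacency` maps each node to a list of its neighbors (assumed format).
    """
    # Initial label of every node: its own degree.
    labels = {u: len(neighbors) for u, neighbors in adjacency.items()}
    for _ in range(num_iterations):
        new_labels = {}
        for u, neighbors in adjacency.items():
            # A sorted tuple is a canonical form of the multiset of neighbor labels.
            neighbor_multiset = tuple(sorted(labels[v] for v in neighbors))
            # Compress (own label, multiset) into a new label (assumed hashing step).
            new_labels[u] = hash((labels[u], neighbor_multiset))
        labels = new_labels
    return labels


def wl_feature(adjacency, num_iterations):
    """Summarize a graph as the histogram of its final node labels."""
    return Counter(wl_labels(adjacency, num_iterations).values())


path = {0: [1], 1: [0, 2], 2: [1]}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
print(wl_feature(path, 2) == wl_feature(triangle, 2))
```

Two graphs with different label histograms cannot be isomorphic; equal histograms do not guarantee isomorphism.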

Pages: 22

# Structural Equivalence on Graph

Published: 2021-09-25

Category: { Graph }

Tags:

Summary: Structural Equivalence means that nodes with similar neighborhood structures will share similar attributes.

Pages: 22

# Node Degree

Published: 2021-09-25

Category: { Graph }

Tags:

Summary: The node degree of a node $u$ is
$$ d_u = \sum_{v\in \mathcal V} A[u,v], $$
where $A$ is the adjacency matrix.
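
As a small illustration, the degrees are the row sums of the adjacency matrix. The sketch below assumes the matrix is stored as a list of lists; the function name `node_degrees` is hypothetical.

```python
def node_degrees(A):
    """Return d_u = sum_v A[u][v] for every node u."""
    return [sum(row) for row in A]


# Path graph 0 - 1 - 2
A = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
print(node_degrees(A))
```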

Pages: 22

# Homophily on Graph

Published: 2021-09-25

Category: { Graph }

Tags:

Summary: Homophily is the principle that contact between similar people occurs at a higher rate than among dissimilar people (McPherson2001).
McPherson2001: McPherson M, Smith-Lovin L, Cook JM. Birds of a Feather: Homophily in Social Networks. Annu Rev Sociol. 2001;27: 415–444. doi:10.1146/annurev.soc.27.1.415

Pages: 22

# Heterophily on Graph

Published: 2021-09-25

Category: { Graph }

Tags:

Summary: Heterophily is the tendency to differ from others. Heterophily on a graph is the tendency of a node to connect to nodes that are different from itself, e.g., nodes with different attributes have a higher probability of sharing an edge.

Pages: 22

# Graph Laplacians

Published: 2021-09-25

Category: { Graph }

Tags:

References:
- Hamilton WL. Graph Representation Learning. Morgan & Claypool Publishers; 2020. pp. 1–159. doi:10.2200/S01045ED1V01Y202009AIM046
- Li J, Guo J-M, Shiu WC. Bounds on normalized Laplacian eigenvalues of graphs. J Inequal Appl. 2014;2014: 1–8. doi:10.1186/1029-242X-2014-316

Summary: The Laplacian is a useful representation of graphs. The unnormalized Laplacian is
$$ \mathbf L = \mathbf D - \mathbf A, $$
where $\mathbf A$ is the [[adjacency matrix]] and $\mathbf D$ is the degree matrix.
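
A minimal sketch of the unnormalized Laplacian, assuming the adjacency matrix is a list of lists and the degree matrix is the diagonal matrix of row sums; the function name `laplacian` is hypothetical.

```python
def laplacian(A):
    """Return L = D - A, with D the diagonal matrix of node degrees."""
    n = len(A)
    degrees = [sum(row) for row in A]
    return [
        [(degrees[i] if i == j else 0) - A[i][j] for j in range(n)]
        for i in range(n)
    ]


# Path graph 0 - 1 - 2
A = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
for row in laplacian(A):
    print(row)
```

Every row of $\mathbf L$ sums to zero, which is a quick sanity check for an implementation.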

Pages: 22

# Graph Cuts

Published: 2021-09-25

Category: { Graph }

Tags:

Summary: Cut
For a subset of nodes $\mathcal A\subset \mathcal V$, the remaining nodes can be denoted as $\bar {\mathcal A} = \mathcal V \setminus \mathcal A$. In other words, $\mathcal A \cup \bar {\mathcal A} = \mathcal V$ and $\mathcal A \cap \bar {\mathcal A} = \emptyset$. The nodes are thus partitioned into two subsets, $\mathcal A$ and $\bar {\mathcal A}$. The cut of this partition is defined as the total number of edges between them,
$$ \operatorname{Cut} \left( \mathcal A, \bar{\mathcal A} \right) = \frac{1}{2} \left( \lvert \{ (u, v)\in \mathcal E: u\in \mathcal A, v\in \bar{\mathcal A} \} \rvert + \lvert \{ (u, v)\in \mathcal E: u\in \bar{\mathcal A}, v\in {\mathcal A} \} \rvert \right). $$
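
The definition can be evaluated literally. The sketch below assumes that $\mathcal E$ stores every undirected edge in both orientations, which is one reading under which the factor $1/2$ yields the number of crossing edges; the function name `cut` and the edge-list format are hypothetical.

```python
def cut(edges, subset):
    """Count the edges between `subset` and the remaining nodes.

    `edges` is assumed to contain both (u, v) and (v, u) for each undirected edge.
    """
    subset = set(subset)
    a_to_rest = sum(1 for u, v in edges if u in subset and v not in subset)
    rest_to_a = sum(1 for u, v in edges if u not in subset and v in subset)
    return (a_to_rest + rest_to_a) / 2


# Square 0 - 1 - 2 - 3 - 0, stored in both orientations
undirected = [(0, 1), (1, 2), (2, 3), (3, 0)]
edges = undirected + [(v, u) for u, v in undirected]
print(cut(edges, {0, 1}))
```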

Pages: 22

# Graph Clustering Coefficient

Published: 2021-09-25

Category: { Graph }

Tags:

Summary: Local Clustering Coefficient
$$ c_u = \frac{ \lvert \{ (v_1,v_2)\in \mathcal E: v_1, v_2 \in \mathcal N(u) \} \rvert}{ \color{red}{\binom{d_u}{2}} }, $$
where $\color{red}{\binom{d_u}{2}}$ is the number of all possible pairs of neighbors of $u$, and $\mathcal N(u)$ is the set of nodes that are neighbors of $u$.
Closed Triangles Ego Graph
Counting the closed triangles in the ego graph of a node and normalizing the count by the total possible number of triangles also gives a measure of the clustering coefficient.
If the ego graph of $u$ is fully connected, we have $c_u=1$; if the ego graph of $u$ is a star, we have $c_u=0$.
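
A minimal sketch of the local clustering coefficient, assuming the graph is given as a dictionary mapping each node to the set of its neighbors. The function name `local_clustering` is hypothetical, and returning 0 for nodes with fewer than two neighbors is a convention chosen here, not something the definition prescribes.

```python
from itertools import combinations


def local_clustering(adjacency, u):
    """Fraction of pairs of neighbors of u that are themselves connected."""
    neighbors = adjacency[u]
    d = len(neighbors)
    if d < 2:
        # Convention (assumption): no pairs of neighbors means coefficient 0.
        return 0.0
    links = sum(1 for v1, v2 in combinations(neighbors, 2) if v2 in adjacency[v1])
    return links / (d * (d - 1) / 2)


triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
print(local_clustering(triangle, 0), local_clustering(star, 0))
```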

Pages: 22

# Graph Adjacency Matrix

Published: 2021-09-25

Category: { Graph }

Tags:

Summary: A graph $\mathcal G$ can be represented with an adjacency matrix $\mathbf A$. There are some nice and clear examples on Wikipedia, for example,
$$ \begin{pmatrix} 2 & 1 & 0 & 0 & 1 & 0\\ 1 & 0 & 1 & 0 & 1 & 0\\ 0 & 1 & 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0 & 1 & 1\\ 1 & 1 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1 & 0 & 0 \end{pmatrix} $$
for the graph shown in the accompanying figure (public domain).
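
To make the correspondence concrete, the matrix above can be rebuilt from an edge list. The edge list below is read off from the matrix itself (it is not given in the text), nodes are numbered from 0, and the self-loop on the first node is assumed to contribute 2 to the diagonal. The function name `adjacency_matrix` is hypothetical.

```python
def adjacency_matrix(num_nodes, edges):
    """Build the adjacency matrix of an undirected graph from an edge list.

    A self-loop (u, u) adds 2 to A[u][u], matching the example matrix.
    """
    A = [[0] * num_nodes for _ in range(num_nodes)]
    for u, v in edges:
        A[u][v] += 1
        A[v][u] += 1
    return A


# Edges inferred from the example matrix (0-indexed), including one self-loop.
edges = [(0, 0), (0, 1), (0, 4), (1, 2), (1, 4), (2, 3), (3, 4), (3, 5)]
for row in adjacency_matrix(6, edges):
    print(row)
```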

Pages: 22

# Eigenvector Centrality of a Graph

Published: 2021-09-25

Category: { Graph }

Tags:

Summary: Given a graph with adjacency matrix $\mathbf A$, the eigenvector centrality is
$$ \mathbf e_u = \frac{1}{\lambda} \sum_{v\in\mathcal V} \mathbf A[u,v] \mathbf e_v, \qquad \forall u \in \mathcal V. $$
Why is it called Eigenvector Centrality
The definition is equivalent to
$$ \lambda \mathbf e = \mathbf A\mathbf e. $$
Power Iteration
The solution $\mathbf e$ is the eigenvector that corresponds to the largest eigenvalue $\lambda_1$. The power iteration method can help us get this eigenvector: iteration $t+1$ is related to the previous iteration $t$ through the following relation,
$$ \mathbf e^{(t+1)} = \mathbf A \mathbf e^{(t)}. $$
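
A minimal sketch of power iteration in pure Python. Rescaling the vector after every step is an addition not shown in the relation above; it only changes the length of $\mathbf e$ and keeps the numbers from overflowing. The function name `eigenvector_centrality`, the starting vector of ones, and the fixed number of iterations are assumptions, and the sketch assumes a connected, non-bipartite graph so that the iteration converges.

```python
import math


def eigenvector_centrality(A, num_iterations=200):
    """Approximate the leading eigenpair of A by power iteration."""
    n = len(A)
    e = [1.0] * n
    for _ in range(num_iterations):
        # e^(t+1) = A e^(t)
        new_e = [sum(A[u][v] * e[v] for v in range(n)) for u in range(n)]
        # Rescale to unit length (added for numerical stability).
        norm = math.sqrt(sum(x * x for x in new_e))
        e = [x / norm for x in new_e]
    # Rayleigh quotient e^T A e gives the eigenvalue estimate for a unit vector.
    lam = sum(e[u] * sum(A[u][v] * e[v] for v in range(n)) for u in range(n))
    return lam, e


# Triangle 0-1-2 with an extra node 3 attached to node 0
A = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
lam, e = eigenvector_centrality(A)
print(lam, e)
```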

Pages: 22

# Betweenness Centrality of a Graph

Published: 2021-09-25

Category: { Graph }

Tags:

Summary: The betweenness centrality of a node $v$ measures how likely the shortest path between two nodes $u_s$ and $u_t$ is to pass through node $v$,
$$ c(v) = \sum_{v\neq u_s\neq u_t} \frac{\sigma_{u_su_t}(v) }{\sigma_{u_su_t}}, $$
where $\sigma_{u_su_t}(v)$ is the number of shortest paths between $u_s$ and $u_t$ that pass through $v$, while $\sigma_{u_su_t}$ is the total number of shortest paths between $u_s$ and $u_t$.
A figure from Wikipedia demonstrates this idea well. The nodes on the outreach have smaller betweenness centrality, while the nodes in the core have higher betweenness centrality.
Figure source: Wikipedia
Outreach and Core
It is almost like cheating to use the words “outreach” and “core” here.
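
The formula can be evaluated by brute force on small unweighted graphs. The sketch below is not an efficient algorithm; it runs one breadth-first search per node and uses the fact that a shortest path from $u_s$ to $u_t$ passes through $v$ exactly when the distances add up. It sums over ordered pairs $(u_s, u_t)$, so each unordered pair of an undirected graph is counted twice; that reading of the sum, the adjacency-set input format, and the function names are assumptions.

```python
from collections import deque


def bfs_counts(adjacency, source):
    """Return distances from `source` and the number of shortest paths to each node."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adjacency[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                # Every shortest path to u extends to a shortest path to w.
                sigma[w] += sigma[u]
    return dist, sigma


def betweenness(adjacency, v):
    """Sum over ordered pairs (s, t) of the fraction of shortest paths through v."""
    info = {s: bfs_counts(adjacency, s) for s in adjacency}
    dist_v, sigma_v = info[v]
    total = 0.0
    for s in adjacency:
        dist_s, sigma_s = info[s]
        for t in adjacency:
            if len({s, t, v}) < 3 or t not in dist_s or v not in dist_s:
                continue
            if dist_s[v] + dist_v[t] == dist_s[t]:
                total += sigma_s[v] * sigma_v[t] / sigma_s[t]
    return total


star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
print(betweenness(star, 0), betweenness(star, 1))
```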

Pages: 22

# Graph Local Overlap Measure: Sorensen Index

Published: 2021-09-26

Category: { Graph }

Tags:

Summary: The Sorensen index is
$$ \mathbf S_{\text{Sorensen}}[u,v] = \frac{ 2\lvert \mathcal N (u) \cap \mathcal N(v) \rvert }{ d_u + d_v}, $$
where $d_u$ is the node degree of node $u$ and $\mathcal N(u)$ is the set of neighbors of $u$.
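
A minimal sketch of the Sorensen index, assuming the graph is a dictionary that maps each node to the set of its neighbors and that at least one of the two nodes has a neighbor; the function name `sorensen_index` is hypothetical.

```python
def sorensen_index(adjacency, u, v):
    """2 |N(u) ∩ N(v)| / (d_u + d_v)."""
    common = adjacency[u] & adjacency[v]
    return 2 * len(common) / (len(adjacency[u]) + len(adjacency[v]))


# Nodes 0 and 1 share the neighbors 2 and 3; node 1 has one extra neighbor.
graph = {0: {2, 3}, 1: {2, 3, 4}, 2: {0, 1}, 3: {0, 1}, 4: {1}}
print(sorensen_index(graph, 0, 1))
```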

Pages: 22

# Graph Local Overlap Measure: Salton Index

Published: 2021-09-26

Category: { Graph }

Tags:

Summary: The Salton index is
$$ \mathbf S_{\text{Salton}}[u,v] = \frac{ 2\lvert \mathcal N (u) \cap \mathcal N(v) \rvert }{ \sqrt{d_u d_v}}, $$
where $d_u$ is the node degree of node $u$ and $\mathcal N(u)$ is the set of neighbors of $u$.
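
A minimal sketch of the Salton index exactly as written above, including the factor of 2 in the numerator. The adjacency-set input format and the function name `salton_index` are assumptions, and both nodes are assumed to have at least one neighbor.

```python
import math


def salton_index(adjacency, u, v):
    """2 |N(u) ∩ N(v)| / sqrt(d_u d_v), following the formula above."""
    common = adjacency[u] & adjacency[v]
    return 2 * len(common) / math.sqrt(len(adjacency[u]) * len(adjacency[v]))


# Nodes 0 and 1 share the neighbors 2 and 3; node 1 has one extra neighbor.
graph = {0: {2, 3}, 1: {2, 3, 4}, 2: {0, 1}, 3: {0, 1}, 4: {1}}
print(salton_index(graph, 0, 1))
```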

Pages: 22

# Graph Local Overlap Measure: Resource Allocation Index

Published: 2021-09-26

Category: { Graph }

Tags:

Summary: The Resource Allocation (RA) index is
$$ \mathbf S_{\text{RA}}[v_1,v_2] = \sum_{u\in\mathcal N(v_1) \cap \mathcal N(v_2)} \frac{1}{d_u}, $$
where $d_u$ is the node degree of node $u$ and $\mathcal N(v)$ is the set of neighbors of $v$.
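
A minimal sketch of the RA index, summing the inverse degrees of the common neighbors of the two nodes. The adjacency-set input format and the function name `resource_allocation_index` are assumptions.

```python
def resource_allocation_index(adjacency, v1, v2):
    """Sum of 1 / d_u over the common neighbors u of v1 and v2."""
    common = adjacency[v1] & adjacency[v2]
    return sum(1 / len(adjacency[u]) for u in common)


# Nodes 0 and 1 share the neighbors 2 and 3, each of degree 2.
graph = {0: {2, 3}, 1: {2, 3, 4}, 2: {0, 1}, 3: {0, 1}, 4: {1}}
print(resource_allocation_index(graph, 0, 1))
```

Common neighbors with a high degree contribute little, so hubs count less than low-degree neighbors.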

Pages: 22

# Graph Local Overlap Measure: Adamic Adar Index

Published: 2021-09-26

Category: { Graph }

Tags:

References:
- Adamic LA, Adar E. Friends and neighbors on the Web. Soc Networks. 2003;25: 211–230. doi:10.1016/S0378-8733(03)00009-1
- Hamilton WL. Graph Representation Learning. Morgan & Claypool Publishers; 2020. pp. 1–159. doi:10.2200/S01045ED1V01Y202009AIM046

Summary: The Adamic Adar (AA) index is
$$ \mathbf S_{\text{AA}}[v_1,v_2] = \sum_{u\in\mathcal N(v_1) \cap \mathcal N(v_2)} \frac{1}{\log d_u}, $$
where $d_u$ is the node degree of node $u$ and $\mathcal N(v)$ is the set of neighbors of $v$.
If two distinct nodes share a neighbor, the degree of that neighbor is at least 2, so $\log d_u > 0$ and it is safe to use $1/\log d_u$.
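
A minimal sketch of the AA index, using the natural logarithm and assuming the two nodes are distinct so that every common neighbor has degree at least 2. The adjacency-set input format and the function name `adamic_adar_index` are assumptions.

```python
import math


def adamic_adar_index(adjacency, v1, v2):
    """Sum of 1 / log(d_u) over the common neighbors u of v1 and v2."""
    common = adjacency[v1] & adjacency[v2]
    return sum(1 / math.log(len(adjacency[u])) for u in common)


# Nodes 0 and 1 share the neighbors 2 and 3, each of degree 2.
graph = {0: {2, 3}, 1: {2, 3, 4}, 2: {0, 1}, 3: {0, 1}, 4: {1}}
print(adamic_adar_index(graph, 0, 1))
```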

Pages: 22

# Graph Isomorphism

Published: 2021-09-26

Category: { Graph }

Tags:

Summary: Two graphs $\mathcal G$ and $\mathcal H$ are isomorphic if there exists a bijection $f$ from the nodes of $\mathcal G$ to the nodes of $\mathcal H$ such that
$$ u, v \text{ adjacent in } \mathcal G \iff f(u), f(v) \text{ adjacent in } \mathcal H. $$
An algorithm to find approximate isomorphism is the [[Weisfeiler Lehman Method]].
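
For very small graphs, the adjacency condition can be checked directly by trying every bijection between the two node sets. This brute-force sketch is only an illustration of the definition, not a practical algorithm (its cost grows factorially with the number of nodes); the adjacency-set input format and the function name `find_isomorphism` are assumptions.

```python
from itertools import permutations


def find_isomorphism(adj_g, adj_h):
    """Return a node mapping f that preserves adjacency, or None if none exists."""
    nodes_g, nodes_h = list(adj_g), list(adj_h)
    if len(nodes_g) != len(nodes_h):
        return None
    for perm in permutations(nodes_h):
        f = dict(zip(nodes_g, perm))
        # u, v adjacent in G  <=>  f(u), f(v) adjacent in H, for all pairs.
        if all(
            (v in adj_g[u]) == (f[v] in adj_h[f[u]])
            for u in nodes_g
            for v in nodes_g
        ):
            return f
    return None


path_g = {0: {1}, 1: {0, 2}, 2: {1}}
path_h = {"x": {"z"}, "y": {"z"}, "z": {"x", "y"}}
print(find_isomorphism(path_g, path_h))
```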

Pages: 22

# Graph Global Overlap Measure: Random Walk Similarity

Published: 2021-09-26

Category: { Graph }

Tags:

Summary: Random Walk
Construct a stochastic transfer matrix $\mathbf P$ by normalizing the adjacency matrix $\mathbf A$ using the node degrees of the target nodes,
$$ \mathbf P = \mathbf A \mathbf D^{-1}, $$
where $\mathbf A$ is the [[adjacency matrix]] and $\mathbf D$ is a diagonal matrix with the node degrees as its diagonal elements.
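
A minimal sketch of the normalization, assuming a symmetric adjacency matrix stored as a list of lists and no isolated nodes (a zero degree would cause a division by zero). Multiplying by $\mathbf D^{-1}$ from the right divides column $v$ by $d_v$, so every column of $\mathbf P$ sums to 1. The function name `transition_matrix` is hypothetical.

```python
def transition_matrix(A):
    """Return P = A D^{-1}, i.e. P[u][v] = A[u][v] / d_v."""
    n = len(A)
    degrees = [sum(row) for row in A]
    return [[A[u][v] / degrees[v] for v in range(n)] for u in range(n)]


# Path graph 0 - 1 - 2
A = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
for row in transition_matrix(A):
    print(row)
```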

Pages: 22

# Graph Global Overlap Measure: Leicht-Holme-Newman Index

Published: 2021-09-26

Category: { Graph }

Tags:

References:
- Lu L, Zhou T. Link Prediction in Complex Networks: A Survey. arXiv [physics.soc-ph]. 2010. Available: http://arxiv.org/abs/1010.0725
- Hamilton WL. Graph Representation Learning. Morgan & Claypool Publishers; 2020. pp. 1–159. doi:10.2200/S01045ED1V01Y202009AIM046

Summary: The LHN index is a normalized similarity index.
From Katz Index to LHN Index
The [[Katz Index]] has a knob to tune the punishment towards longer paths.

Pages: 22

# Graph Global Overlap Measure: Katz Index

Published: 2021-09-26

Category: { Graph }

Tags:

References:
- Katz L. A new status index derived from sociometric analysis. Psychometrika. 1953;18: 39–43. doi:10.1007/BF02289026
- Hamilton WL. Graph Representation Learning. Morgan & Claypool Publishers; 2020. pp. 1–159. doi:10.2200/S01045ED1V01Y202009AIM046

Summary: The Katz index is
$$ \mathbf S_{\text{Katz}}[u,v] = \sum_{i=1}^\infty \beta^i \mathbf A^i[u, v], $$
where $\mathbf A^i[u, v]$ is the $[u, v]$ element of the matrix $\mathbf A$ raised to the $i$th power. The same holds for $\beta^i$. The Katz index describes the similarity between node $u$ and node $v$.
Do not confuse power with contravariant indices
For readers familiar with tensor notation, this might be confusing. We sometimes use contravariant indices on the top right of the tensor notation.
But here ${}^{i}$ means to the $i$th power.
The index can be shown to equal the following, provided the series converges, which holds when $0 < \beta < 1/\lambda_1$ with $\lambda_1$ the largest eigenvalue of $\mathbf A$,
$$ \mathbf S_{\text{Katz}} = (\mathbf I - \beta \mathbf A)^{-1} - \mathbf I. $$
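
The closed form can be checked numerically without inverting a matrix: if $\mathbf S$ is the truncated series, then $(\mathbf I - \beta \mathbf A)(\mathbf S + \mathbf I)$ should be close to $\mathbf I$. The sketch below is a pure-Python illustration under the assumption that $\beta$ is small enough for the series to converge; the helper `matmul`, the function name `katz_index`, and the number of terms are hypothetical choices.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of lists."""
    inner, cols = len(Y), len(Y[0])
    return [
        [sum(X[i][k] * Y[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(len(X))
    ]


def katz_index(A, beta, num_terms=100):
    """Approximate sum_{i>=1} beta^i A^i by truncating the series."""
    n = len(A)
    S = [[0.0] * n for _ in range(n)]
    # `term` holds beta^i A^i; it starts at the identity (i = 0).
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(num_terms):
        term = [[beta * x for x in row] for row in matmul(term, A)]
        for i in range(n):
            for j in range(n):
                S[i][j] += term[i][j]
    return S


# Path graph 0 - 1 - 2; its largest eigenvalue is sqrt(2), so beta = 0.1 converges.
A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
for row in katz_index(A, beta=0.1):
    print(row)
```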

Pages: 22

# Graph Convolution Operator

Published: 2021-11-25

Category: { Graph }

Tags:

Summary: For a given graph $\mathcal G$, we have an attribute on each node, denoted as $f_v$. All the node attributes put together can be written as a list $\mathbf f\to (f_{v_1}, f_{v_2}, \cdots, f_{v_N})$.
Convolution on a graph combines the attributes of nodes with those of their neighbors. The [[adjacency matrix]] $\mathbf A$ applied to all node attributes $\mathbf f$ is such an operation.
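
One way to read "the adjacency matrix applied to the node attributes" is the matrix-vector product $\mathbf A \mathbf f$, whose entry for node $u$ is the sum of the attributes of the neighbors of $u$. The sketch below illustrates that reading only; the source sentence is cut off, so the exact operation intended there is an assumption, and the function name `aggregate_neighbors` is hypothetical.

```python
def aggregate_neighbors(A, f):
    """Return A f: for each node, the sum of its neighbors' attributes."""
    return [sum(a * x for a, x in zip(row, f)) for row in A]


# Path graph 0 - 1 - 2 with one scalar attribute per node
A = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
f = [1.0, 2.0, 4.0]
print(aggregate_neighbors(A, f))
```

With a plain adjacency matrix (no self-loops), a node's own attribute is not part of its aggregated value.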

Pages: 22