Centered Kernel Alignment (CKA) is a similarity metric designed to measure the similarity between feature representations in neural networks [1].

## Definition of CKA

CKA is based on the Hilbert-Schmidt Independence Criterion (HSIC).

Given two kernel matrices over the feature representations, $K=k(x,x)$ and $L=l(y,y)$, HSIC is defined as

$$\operatorname{HSIC}(K, L) = \frac{1}{(n-1)^2} \operatorname{tr}( K H L H ),$$

where

• $x$, $y$ are the feature representations being compared (e.g. activations from two layers),
• $n$ is the number of examples, so $K$ and $L$ are $n \times n$ matrices,
• $H = I_n - \frac{1}{n}\mathbf{1}\mathbf{1}^\top$ is the so-called centering matrix.
We can choose different kernel functions $k$ and $l$. For example, if $k$ and $l$ are linear kernels, we have $k(x, y) = l(x, y) = x^\top y$. In this linear case, with representation matrices $X$ and $Y$ (one row per example) so that $K = XX^\top$ and $L = YY^\top$, HSIC simplifies to

$$\operatorname{HSIC}(K, L) = \frac{1}{(n-1)^2} \lVert \bar{Y}^\top \bar{X} \rVert_F^2,$$

where $\bar{X} = HX$ and $\bar{Y} = HY$ are the column-centered representations.
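As a sanity check on the linear case, the following NumPy sketch computes HSIC both from the trace formula $\frac{1}{(n-1)^2}\operatorname{tr}(KHLH)$ and from the centered Frobenius-norm form, and confirms they agree. The function name `linear_hsic` and the random test data are illustrative choices, not from the original text.

```python
import numpy as np

def linear_hsic(X, Y):
    """HSIC with linear kernels, via tr(K H L H) / (n-1)^2."""
    n = X.shape[0]
    K = X @ X.T                            # linear kernel on X: K_ij = x_i . x_j
    L = Y @ Y.T                            # linear kernel on Y
    H = np.eye(n) - np.ones((n, n)) / n    # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 10))  # 50 examples, 10 features
Y = rng.standard_normal((50, 20))  # 50 examples, 20 features

# Equivalent closed form for the linear case: ||Ybar^T Xbar||_F^2 / (n-1)^2
Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
closed_form = np.linalg.norm(Yc.T @ Xc, "fro") ** 2 / (X.shape[0] - 1) ** 2
assert np.isclose(linear_hsic(X, Y), closed_form)
```

Note that the two representations may have different feature dimensions; only the number of examples $n$ must match, since $K$ and $L$ are both $n \times n$.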

However, HSIC is not invariant to isotropic scaling, a property that is required of a similarity metric for representations [1]. CKA is a normalized version of HSIC:

$$\operatorname{CKA}(K,L) = \frac{\operatorname{HSIC}(K, L)}{\sqrt{\operatorname{HSIC}(K,K) \operatorname{HSIC}(L,L)}}.$$
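The normalization above can be implemented directly on top of HSIC. The minimal sketch below (function names `hsic` and `cka` are my own) verifies two properties implied by the definition: a representation's CKA with itself is 1, and CKA is invariant to isotropic scaling of a kernel matrix.

```python
import numpy as np

def hsic(K, L):
    """HSIC(K, L) = tr(K H L H) / (n-1)^2 for n x n kernel matrices."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n    # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

def cka(K, L):
    """CKA(K, L) = HSIC(K, L) / sqrt(HSIC(K, K) * HSIC(L, L))."""
    return hsic(K, L) / np.sqrt(hsic(K, K) * hsic(L, L))

rng = np.random.default_rng(0)
X = rng.standard_normal((40, 8))
K = X @ X.T                                # linear kernel matrix

assert np.isclose(cka(K, K), 1.0)          # self-similarity equals 1
assert np.isclose(cka(K, 3.7 * K), 1.0)    # invariant to isotropic scaling
```

The scaling invariance follows because multiplying $L$ by a constant $c$ scales both the numerator and the denominator of CKA by $c$, which is exactly what the normalization of HSIC is designed to achieve.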