Hilbert-Schmidt Independence Criterion (HSIC)

Given the Gram matrices of two feature representations, $K_{ij} = k(x_i, x_j)$ and $L_{ij} = l(y_i, y_j)$, HSIC is defined as[1][2]

$$\operatorname{HSIC}(K, L) = \frac{1}{(n-1)^2} \operatorname{tr}(K H L H),$$

where

  • $x_i$, $y_i$ are the feature representations of the $i$-th example,
  • $n$ is the number of examples, so $K$ and $L$ are $n \times n$ matrices,
  • $H$ is the so-called [[centering matrix]], $H_n = I_n - \frac{1}{n}\mathbf{1}\mathbf{1}^\top$, which centers a set of vectors around their mean.
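The definition above translates directly into a few lines of NumPy. This is a minimal sketch of the empirical estimator; the RBF kernel, its bandwidth `sigma`, and the toy data are illustrative choices, not part of the original definition:

```python
import numpy as np

def rbf_kernel(X, sigma=1.0):
    """Gram matrix K_ij = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T
    return np.exp(-d2 / (2 * sigma**2))

def hsic(K, L):
    """Empirical HSIC from two n x n Gram matrices."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
Y_dep = X + 0.1 * rng.normal(size=(100, 2))  # strongly dependent on X
Y_ind = rng.normal(size=(100, 2))            # independent of X

print(hsic(rbf_kernel(X), rbf_kernel(Y_dep)))  # relatively large
print(hsic(rbf_kernel(X), rbf_kernel(Y_ind)))  # close to zero
```

Since $K$, $L$, and $H L H$ are all positive semi-definite, the estimate is nonnegative, and it is larger for dependent representations than for independent ones.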

We can choose different kernel functions $k$ and $l$. For example, if $k$ and $l$ are linear kernels, we have $k(x, y) = l(x, y) = x^\top y$. In this linear case, HSIC is simply $\lVert \operatorname{cov}(X^\top, Y^\top) \rVert_F^2$, the squared Frobenius norm of the cross-covariance matrix of the two representations.
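The linear-kernel equivalence can be checked numerically. A small sketch, assuming representation matrices with one example per row (the shapes and random data are illustrative):

```python
import numpy as np

def hsic(K, L):
    """Empirical HSIC from two n x n Gram matrices."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

rng = np.random.default_rng(0)
n = 50
X = rng.normal(size=(n, 3))  # n examples, 3-dim representation
Y = rng.normal(size=(n, 5))  # n examples, 5-dim representation

# Linear kernels give the Gram matrices K = X X^T and L = Y Y^T.
hsic_linear = hsic(X @ X.T, Y @ Y.T)

# Equivalent form: squared Frobenius norm of the cross-covariance matrix.
Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
cov_xy = Xc.T @ Yc / (n - 1)
frob_sq = np.linalg.norm(cov_xy, "fro") ** 2

print(np.allclose(hsic_linear, frob_sq))  # True
```

The identity follows because $H$ is symmetric and idempotent, so $\operatorname{tr}(XX^\top H YY^\top H) = \lVert (HX)^\top (HY) \rVert_F^2$, i.e. the squared Frobenius norm of the centered cross-product.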


  1. Gretton A, Bousquet O, Smola A, Schölkopf B. Measuring Statistical Dependence with Hilbert-Schmidt Norms. Algorithmic Learning Theory. Springer Berlin Heidelberg; 2005. pp. 63–77. doi:10.1007/11564089_7

  2. Kornblith S, Norouzi M, Lee H, Hinton G. Similarity of Neural Network Representations Revisited. arXiv [cs.LG]. 2019. Available: http://arxiv.org/abs/1905.00414


