Cross Entropy
Cross entropy is defined as[^cross_entropy_wiki]
$$ H(p, q) = \mathbb E_{p} \left[ -\log q \right]. $$
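For discrete distributions, this expectation is a weighted sum. A minimal sketch in NumPy (the function name `cross_entropy` is our own, not from any particular library):

```python
import numpy as np

def cross_entropy(p, q):
    """Cross entropy H(p, q) = E_p[-log q] for discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Terms with p = 0 contribute nothing; mask them to avoid log(0).
    mask = p > 0
    return -np.sum(p[mask] * np.log(q[mask]))

p = np.array([0.5, 0.5])
q = np.array([0.9, 0.1])
print(cross_entropy(p, q))  # ~1.204 nats
```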
Cross entropy $H(p, q)$ can also be decomposed as
$$ H(p, q) = H(p) + \operatorname{D}_{\mathrm{KL}} \left( p \parallel q \right), $$
where $H(p)$ is the [[Shannon entropy]] of $p$, i.e., the expectation of the information content $I(x) = -\log p(x)$,
$$ H(p) = \mathbb E_{p}\left[ -\log p \right], $$
and $\operatorname{D}_{\mathrm{KL}}$ is the [[KL divergence]], which measures the difference between two distributions.
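A minimal numerical check of this decomposition, assuming discrete distributions stored as NumPy arrays (`entropy` and `kl_divergence` are our own helper names):

```python
import numpy as np

def entropy(p):
    """Shannon entropy H(p) = E_p[-log p] of a discrete distribution."""
    p = p[p > 0]  # terms with p = 0 contribute nothing
    return -np.sum(p * np.log(p))

def kl_divergence(p, q):
    """KL divergence D_KL(p || q) = E_p[log p - log q]."""
    mask = p > 0
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))

p = np.array([0.5, 0.5])
q = np.array([0.9, 0.1])
cross = -np.sum(p * np.log(q))  # H(p, q); p has no zero entries here
print(np.isclose(cross, entropy(p) + kl_divergence(p, q)))  # True
```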
Cross entropy is widely used as a loss function in classification problems, e.g., [[logistic regression]][^Mehta2019].
Binary Cross Entropy
For a dataset with two classes ($0$ and $1$) in the target, denote the true label distribution by $p$ and the predicted distribution by $q$. For example, $q_{\hat y=1}$ denotes the predicted probability of the label being $1$.
$$ \begin{align*} H(p, q) =& - p_{y=0} \log (q_{\hat y=0}) - p_{y=1} \log (q_{\hat y=1}) \\ =& - p_{y=0} \log (q_{\hat y=0}) - (1 - p_{y=0}) \log ( 1 - q_{\hat y=0} ) \end{align*} $$
For a single sample with a hard label $y\in \{0,1\}$, so that $p_{y=0} = 1 - y$, this reduces to
$$ H(p, q) = \begin{cases} - \log (q_{\hat y=0}) , & \text{for } y=0 \\ - \log ( 1 - q_{\hat y=0} ) , & \text{for } y=1. \end{cases} $$
Combining the two cases into a single formula,
$$ H(p, q) = - (1 - y) \log (q_{\hat y=0}) - y \log ( 1 - q_{\hat y=0} ). $$
The probabilities $q_{\hat y=0}$ and $q_{\hat y=1} = 1 - q_{\hat y=0}$ are produced by a model.
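A sketch of this combined formula, following the convention above that the model outputs $q_{\hat y=0}$ (the helper name `binary_cross_entropy` and the clipping constant are our own choices):

```python
import numpy as np

def binary_cross_entropy(y, q0):
    """BCE per the formula above: y is the true label in {0, 1},
    q0 is the predicted probability of class 0 (so q1 = 1 - q0)."""
    y = np.asarray(y, dtype=float)
    q0 = np.asarray(q0, dtype=float)
    # Clip to avoid log(0) for overconfident predictions.
    q0 = np.clip(q0, 1e-12, 1.0 - 1e-12)
    return -(1.0 - y) * np.log(q0) - y * np.log(1.0 - q0)

y = np.array([0, 1, 1])
q0 = np.array([0.8, 0.3, 0.1])              # predicted P(class = 0)
print(binary_cross_entropy(y, q0))          # per-sample losses
print(binary_cross_entropy(y, q0).mean())   # mean loss over the dataset
```

Averaging the per-sample losses gives the usual training objective; a logistic regression model, for instance, would supply $q_{\hat y=0}$ via its sigmoid output.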
[^cross_entropy_wiki]: Contributors to Wikimedia projects. Cross entropy. In: Wikipedia [Internet]. 4 Jul 2021 [cited 4 Sep 2021]. Available: https://en.wikipedia.org/wiki/Cross_entropy
[^Mehta2019]: Mehta P, Wang C-H, Day AGR, Richardson C, Bukov M, Fisher CK, et al. A high-bias, low-variance introduction to Machine Learning for physicists. Phys Rep. 2019;810: 1–124. doi:10.1016/j.physrep.2019.03.001