# Conformal Prediction

Conformal prediction is a method to predict a consistent confidence interval in an on-line setting. The algorithms is following the [[Neyman-Pearson hypothesis testing]] Neyman-Pearson Theory The Neyman-Pearson hypothesis testing tests two hypothesis, hypothesis $H$, and an alternative hypothesis $H_A$. Neyman-Pearson Lemma The Neyman-Pearson Lemma is an very intuitive lemma to understand how to choose a hypothesis. The lecture notes from PennState is a very good read on this topic1. An example For simplicity, we assume that there exists a test statistic $T$ and $T$ can be used to measure how likely the hypothesis $H$ is true, e.g., the hypothesis $H$ is false, corresponds to $T$ … framework, thus providing solid theoretical support for the predicted region.

## Nonconformity Measure

For a new incoming data point $x$, and an existing [[multiset]] Multiset, mset or bag A bag is a set in which duplicate elements are allowed. An ordered bag is a list that we use in programming. of data points

$$ B_n = \{\!\{x_1, \cdots, x_{n}\}\!\}, $$

a nonconformity measure, $A(B_n, x)$, measures how different $x_n$ is from the existing multiset^{1}.

A naive choice would be to reuse our model $x = f(B_n)$. Define a nonconformity measure

$$ A(B_n, x) = d(f(B_n), x), $$

where $d$ is some kind of [[Distance]] Distance .

## A Conformal Algorithm Example

Given two existing data points

$$ B_3 = \{\{ x_1=1, x_2=4 \}\}, $$

we will decide if $x_3=2$ should be included in our predicted region of significance level $\alpha=0.05$.

To be able to test if $x_3=2$ falls inside the predicted region, we simply take the average of data points for the point prediction, and we use L1 norm as the nonconformity measure, i.e.,

$$ \begin{align} f(B_n) &= \frac{1}{n}\sum_i x_i \\ A(f(B_n), x) &= \lvert x - f(B_n) \rvert. \end{align} $$

With the model and nonconformity measure set, we can deicide whether the new data poin $x_3 = 2$ should be included in the predicted region of significance level $\alpha=0.05$.

For $i=1,2,3$,

calculate

$$ f_i = f(B_3\{\{x_i\}\}) $$

calculate the nonconformity measure

$$ \alpha_i = A(f_i, x_i) $$

With all $\alpha_i$, count all the data points that leads to $\alpha_i\geq \alpha_3$, denoted as $N_{\alpha_i\geq\alpha_3}$, calculate the probability

$$ p_x = \frac{N_{\alpha_i\geq\alpha_3}}{n}. $$

This algorithm is basically a hypothesis test procedure using the
[[Neyman-Pearson theory]]
Neyman-Pearson Theory
The Neyman-Pearson hypothesis testing tests two hypothesis, hypothesis $H$, and an alternative hypothesis $H_A$.
Neyman-Pearson Lemma The Neyman-Pearson Lemma is an very intuitive lemma to understand how to choose a hypothesis. The lecture notes from PennState is a very good read on this topic1.
An example For simplicity, we assume that there exists a test statistic $T$ and $T$ can be used to measure how likely the hypothesis $H$ is true, e.g., the hypothesis $H$ is false, corresponds to $T$ …
^{1}.

## Requires Exchangeability

The algorithm assumes the data points are exchangeabile as we assume no statistical difference between

$$ x_1, x_2, x_3, ..., x_n, $$

and

$$ x_2, x_3, ..., x_n, x_1 $$

and

$$ x_1, x_3, ..., x_n, x_2 $$

and all the possible permutations of this pattern. Obviously, i.i.d. data satisfies this condition. Note that i.i.d. is a more stringent condition^{1}.

## Bonferroni Correction

For predictions involves multiple variable, we should consider the [[Bonferroni Correction]] Bonferroni Correction Bonferroni correction is very useful in a multiple comparison problem .

L Ma (2022). 'Conformal Prediction', Datumorphism, 04 April. Available at: https://datumorphism.leima.is/wiki/statistical-estimation/conformal-prediction/.