Confusion Matrix
A confusion matrix is an NxN table that summarizes the performance of a classification model. The rows of the table represent actual classes, and the columns represent predicted classes. Each cell corresponds to a possible outcome and represents the number of examples with that outcome.
For example, the confusion matrix of a binary classification model is a 2x2 table. It has rows representing actual positive and negative classes and columns representing predicted positive and negative classes.
The four possible outcomes in binary classification are called:
- true positive: correctly classifying a positive example
- false positive: misclassifying a negative example as positive, also known as a "type I error"
- true negative: correctly classifying a negative example
- false negative: misclassifying a positive example as negative, also known as a "type II error"
We can use these values to calculate various performance metrics such as accuracy, precision, recall, and more!
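As a quick illustration, here is how accuracy, precision, and recall follow directly from the four cells (the counts below are hypothetical, not taken from the example further down):

```python
# Hypothetical cell counts from a 2x2 confusion matrix
tp, fn, fp, tn = 10, 2, 1, 7

# Accuracy: fraction of all examples classified correctly
accuracy = (tp + tn) / (tp + tn + fp + fn)

# Precision: fraction of predicted positives that are actually positive
precision = tp / (tp + fp)

# Recall: fraction of actual positives the model correctly identified
recall = tp / (tp + fn)

print(f"Accuracy: {accuracy:.2f}")    # 17/20 = 0.85
print(f"Precision: {precision:.2f}")  # 10/11 = 0.91
print(f"Recall: {recall:.2f}")        # 10/12 = 0.83
```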
Computing confusion matrices in scikit-learn
import numpy as np
from sklearn.metrics import confusion_matrix

# Actual and predicted labels for 20 examples
y_true = np.array([1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1])

# Listing the positive class (1) first places true positives in the top-left cell
matrix = confusion_matrix(y_true, y_pred, labels=[1, 0])

# Flatten the 2x2 matrix row by row: [[tp, fn], [fp, tn]]
tp, fn, fp, tn = matrix.ravel()

print("Confusion matrix:", matrix, sep="\n", end="\n\n")
print("True positive:", tp)
print("False negative:", fn)
print("False positive:", fp)
print("True negative:", tn)
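The same function handles the NxN case described above. A minimal sketch with three hypothetical classes (the labels here are made up for illustration):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical actual and predicted labels for a three-class problem
y_true = np.array([0, 1, 2, 2, 0, 1, 1, 2, 0])
y_pred = np.array([0, 2, 2, 2, 0, 1, 1, 0, 0])

# Rows represent actual classes, columns represent predicted classes;
# correct classifications land on the diagonal
matrix = confusion_matrix(y_true, y_pred, labels=[0, 1, 2])
print(matrix)
```

The diagonal of the resulting 3x3 table counts correct predictions per class, while off-diagonal cells show exactly which classes the model confuses with each other.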