Confusion Matrix
A confusion matrix is an NxN table that summarizes the performance of a classification model. The rows of the table represent actual classes, and the columns represent predicted classes. Each cell corresponds to a possible outcome and represents the number of examples with that outcome.
For example, the confusion matrix of a binary classification model is a 2x2 table. It has rows representing actual positive and negative classes and columns representing predicted positive and negative classes.
The four possible outcomes in binary classification are called:
- true positive: correctly classifying a positive example
- false positive: misclassifying a negative example as positive, also known as a "type I error"
- true negative: correctly classifying a negative example
- false negative: misclassifying a positive example as negative, also known as a "type II error"
We can use these values to calculate various performance metrics such as accuracy, precision, recall, and more!
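As a quick illustration, here is how accuracy, precision, and recall follow directly from the four cells (the counts below are hypothetical, not taken from the example further down):

```python
# Hypothetical cell counts from a 2x2 confusion matrix
tp, fn, fp, tn = 10, 2, 1, 7

# Accuracy: fraction of all examples classified correctly
accuracy = (tp + tn) / (tp + tn + fp + fn)

# Precision: fraction of predicted positives that are actually positive
precision = tp / (tp + fp)

# Recall: fraction of actual positives the model correctly identified
recall = tp / (tp + fn)

print(f"Accuracy: {accuracy:.2f}")    # 17/20 = 0.85
print(f"Precision: {precision:.2f}")  # 10/11 = 0.91
print(f"Recall: {recall:.2f}")        # 10/12 = 0.83
```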
Computing confusion matrices in scikit-learn
import numpy as np
from sklearn.metrics import confusion_matrix

# Actual and predicted labels for 20 examples
y_true = np.array([1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1])

# Listing the positive class (1) first places true positives in the top-left cell
matrix = confusion_matrix(y_true, y_pred, labels=[1, 0])

# Flatten the 2x2 matrix row by row: [[tp, fn], [fp, tn]]
tp, fn, fp, tn = matrix.ravel()

print("Confusion matrix:", matrix, sep="\n", end="\n\n")
print("True positive:", tp)
print("False negative:", fn)
print("False positive:", fp)
print("True negative:", tn)
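The same function handles the NxN case described above. A minimal sketch with three hypothetical classes (the labels here are made up for illustration):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical actual and predicted labels for a three-class problem
y_true = np.array([0, 1, 2, 2, 0, 1, 1, 2, 0])
y_pred = np.array([0, 2, 2, 2, 0, 1, 1, 0, 0])

# Rows represent actual classes, columns represent predicted classes;
# correct classifications land on the diagonal
matrix = confusion_matrix(y_true, y_pred, labels=[0, 1, 2])
print(matrix)
```

The diagonal of the resulting 3x3 table counts correct predictions per class, while off-diagonal cells show exactly which classes the model confuses with each other.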