Confusion Matrix

A confusion matrix is an NxN table, where N is the number of classes, that summarizes the performance of a classification model. The rows of the table represent actual classes, and the columns represent predicted classes. Each cell corresponds to one (actual, predicted) outcome and holds the number of examples with that outcome.

For example, the confusion matrix of a binary classification model is a 2x2 table. It has rows representing actual positive and negative classes and columns representing predicted positive and negative classes.

[Figure: example confusion matrix]

The four possible outcomes in binary classification are called:

  • true positive: correctly classifying a positive example
  • false positive: misclassifying a negative example as positive, also known as a β€œtype I error”
  • true negative: correctly classifying a negative example
  • false negative: misclassifying a positive example as negative, also known as a β€œtype II error”
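The four definitions above can be made concrete by counting each outcome by hand. A minimal sketch in plain Python, assuming 1 marks the positive class and 0 the negative class (the example arrays are made up):

```python
# Count the four binary-classification outcomes by hand.
# Assumes 1 is the positive class and 0 is the negative class.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # correct positives
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # type I errors
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)  # correct negatives
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # type II errors

print(tp, fp, tn, fn)  # β†’ 3 1 3 1
```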

We can use these values to calculate various performance metrics such as accuracy, precision, recall, and more!
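As a sketch of how those metrics follow from the four counts, here are the standard formulas applied to the outcome counts from the scikit-learn example below (tp=9, fn=2, fp=0, tn=9):

```python
# Outcome counts taken from the worked example in this note.
tp, fn, fp, tn = 9, 2, 0, 9

accuracy = (tp + tn) / (tp + tn + fp + fn)  # fraction of all predictions that are correct
precision = tp / (tp + fp)                  # of predicted positives, how many are truly positive
recall = tp / (tp + fn)                     # of actual positives, how many were found

print(accuracy)   # β†’ 0.9
print(precision)  # β†’ 1.0
print(round(recall, 3))  # β†’ 0.818
```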

Computing confusion matrices in scikit-learn

import numpy as np
from sklearn.metrics import confusion_matrix

y_true = np.array([1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1])

# labels=[1, 0] puts the positive class in the first row and column,
# so ravel() flattens the 2x2 matrix in the order tp, fn, fp, tn.
matrix = confusion_matrix(y_true, y_pred, labels=[1, 0])
tp, fn, fp, tn = matrix.ravel()

print("Confusion matrix:", matrix, sep="\n", end="\n\n")
print("True positive:", tp)
print("False negative:", fn)
print("False positive:", fp)
print("True negative:", tn)
Confusion matrix:
[[9 2]
 [0 9]]

True positive: 9
False negative: 2
False positive: 0
True negative: 9
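The same function handles the general NxN case. A small sketch with three made-up classes (the label names and data are assumptions for illustration):

```python
from sklearn.metrics import confusion_matrix

y_true = ["cat", "dog", "cat", "bird", "dog", "bird"]
y_pred = ["cat", "cat", "cat", "bird", "dog", "dog"]

# Rows are actual classes and columns are predicted classes,
# both in the order given by the labels argument.
m = confusion_matrix(y_true, y_pred, labels=["bird", "cat", "dog"])
print(m)
# β†’ [[1 0 1]
#    [0 2 0]
#    [0 1 1]]
```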
