Planted Sep 23, 2023
Last tended Nov 12, 2023

Receiver Operating Characteristic

A receiver operating characteristic (ROC) curve is a graph that visualizes the performance of a binary classification model across all possible classification thresholds. It plots the model’s true positive rate (recall, the proportion of positive examples the model correctly classifies as positive) against its false positive rate (the proportion of negative examples the model misclassifies as positive).
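To make the two rates concrete, here is a minimal sketch that computes one ROC point per threshold by hand. The helper name `roc_point`, the toy labels, and the toy scores are illustrative assumptions, not part of scikit-learn or of the example later on this page.

```python
import numpy as np

def roc_point(y_true, y_score, threshold):
    """Return (FPR, TPR) when scores >= threshold are predicted positive.

    Hypothetical helper for illustration only.
    """
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_score) >= threshold
    tp = np.sum(y_pred & y_true)    # positives correctly flagged
    fp = np.sum(y_pred & ~y_true)   # negatives wrongly flagged
    fn = np.sum(~y_pred & y_true)   # positives missed
    tn = np.sum(~y_pred & ~y_true)  # negatives correctly left alone
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    return float(fpr), float(tpr)

# Toy data (assumed): two negatives, two positives.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

# Sweep the threshold from strict to lenient; each value yields one curve point.
points = [roc_point(y_true, y_score, t) for t in (0.9, 0.8, 0.4, 0.35, 0.1)]
for fpr, tpr in points:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Lowering the threshold can only flag more examples as positive, so both rates move from (0, 0) toward (1, 1) as the sweep proceeds.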

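The area under the ROC curve (ROC AUC) is a metric used to evaluate how well a model separates positive examples from negative ones. The ROC AUC ranges from 0 (the worst performance, where every positive example is scored below every negative one) to 1 (the best performance). A model that predicts a class at random would have an ROC AUC of about 0.5, so a score below 0.5 means the model ranks examples worse than chance.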
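One way to build intuition for these values is the ranking interpretation: the ROC AUC equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. The sketch below checks this on toy data; `pairwise_auc`, the toy arrays, and the random seed are assumptions made for illustration.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def pairwise_auc(y_true, y_score):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5.

    Hypothetical helper for illustration only.
    """
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    diff = pos[:, None] - neg[None, :]  # every positive vs. every negative
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

# Toy data (assumed): 3 of the 4 positive/negative pairs are ordered correctly.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
manual = pairwise_auc(y_true, y_score)
library = roc_auc_score(y_true, y_score)
print(f"pairwise: {manual:.2f}, roc_auc_score: {library:.2f}")

# Scores unrelated to the labels should land near 0.5.
rng = np.random.default_rng(0)
random_labels = rng.integers(0, 2, size=2000)
random_scores = rng.random(2000)
random_auc = roc_auc_score(random_labels, random_scores)
print(f"random scores: {random_auc:.2f}")
```

The pairwise version is quadratic in the number of examples, so it is only meant as a check on small inputs.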

Computing ROC AUCs in scikit-learn

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

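# Generate a synthetic binary classification dataset and hold out 20% for testing.
X, y = make_classification(n_samples=200, random_state=4)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1)

clf = LogisticRegression().fit(X_train, y_train)
# Use continuous scores rather than hard class predictions,
# so the metric can account for every threshold.
y_score = clf.decision_function(X_test)
roc_auc = roc_auc_score(y_test, y_score)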

print(f"ROC AUC: {roc_auc:.2f}")

ROC AUC: 0.93
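The same number can also be recovered from the curve itself. As a sketch, the snippet below rebuilds the model from the example above, then uses scikit-learn's `roc_curve` to get the curve's points and `auc` to integrate them with the trapezoidal rule. The variable name `area` is an arbitrary choice.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Same setup as the example above.
X, y = make_classification(n_samples=200, random_state=4)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1)
clf = LogisticRegression().fit(X_train, y_train)
y_score = clf.decision_function(X_test)

# One (FPR, TPR) pair per threshold, ordered from strictest to most lenient.
fpr, tpr, thresholds = roc_curve(y_test, y_score)

# Trapezoidal area under those points.
area = auc(fpr, tpr)
print(f"Area under curve: {area:.2f}")
print(f"roc_auc_score:    {roc_auc_score(y_test, y_score):.2f}")
```

The `fpr` and `tpr` arrays are also what you would pass to a plotting library to draw the curve.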

This page references the following sources: