Linear Regression
Linear regression is a regression algorithm that predicts a target value by computing a linear combination of the input features plus a constant. In other words, linear regression works by drawing the best-fitting line through a dataset and then plugging an example's feature values into the linear equation to make predictions.
We can represent a linear regression model with the following equation:

ŷ = b + w₁x₁ + w₂x₂ + … + wₙxₙ

where:

- ŷ is the predicted value
- b is a constant (called the bias term or the intercept)
- wᵢ is the weight of the i-th feature (also called a coefficient)
- xᵢ is the value of the i-th feature
- n is the number of features
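As a sketch of the equation above, the weighted sum can be computed with a NumPy dot product. The parameter and feature values here are illustrative, not from the example that follows:

```python
import numpy as np

# Illustrative parameters for a model with n = 3 features
b = 2.0                          # bias term (intercept)
w = np.array([0.5, -1.0, 3.0])   # weights w1, w2, w3
x = np.array([4.0, 2.0, 1.0])    # feature values x1, x2, x3

# ŷ = b + w1*x1 + w2*x2 + w3*x3
y_hat = b + np.dot(w, x)
print(y_hat)  # 2.0 + 2.0 - 2.0 + 3.0 = 5.0
```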
For example, we would define a model with one feature that has the parameters b = 32 and w₁ = 9/5 using the following equation:

ŷ = 32 + (9/5)x₁

To predict the output of an example with the feature x₁ = 10, we would calculate:

ŷ = 32 + (9/5 × 10) = 32 + 18 = 50
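The hand calculation above can be verified directly in Python, using the same parameters b = 32 and w₁ = 9/5:

```python
# Parameters from the worked example
b = 32
w_1 = 9 / 5  # 1.8
x_1 = 10

# ŷ = b + w1*x1
y_hat = b + (w_1 * x_1)
print(y_hat)  # 50.0
```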
Linear regression in scikit-learn
import numpy as np
from sklearn.linear_model import LinearRegression

# Generate synthetic data along the line y = 1.8x + 32 with Gaussian noise
NUM_EXAMPLES = 150
rng = np.random.default_rng(seed=0)
noise = rng.normal(0, 3, NUM_EXAMPLES).reshape(-1, 1)
X = np.linspace(0, 50, NUM_EXAMPLES).reshape(-1, 1)
y = ((1.8 * X) + 32) + noise

# Fit the model and extract the learned parameters
reg = LinearRegression().fit(X, y)
b = reg.intercept_.item()
w_1 = reg.coef_.item()

# Predict the target value for a new example with x1 = 10
x_1 = 10
pred = reg.predict(np.array([[x_1]])).item()
print(f"ŷ = b + w\u2081x\u2081 = {b} + ({w_1} * {x_1}) = {pred}")
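Because the synthetic data was generated from the line y = 1.8x + 32, the fitted intercept and coefficient should land close to those generating values. A quick sketch of that check (the tolerances here are arbitrary and depend on the noise level):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Regenerate the same synthetic data (y = 1.8x + 32 plus noise)
NUM_EXAMPLES = 150
rng = np.random.default_rng(seed=0)
noise = rng.normal(0, 3, NUM_EXAMPLES).reshape(-1, 1)
X = np.linspace(0, 50, NUM_EXAMPLES).reshape(-1, 1)
y = ((1.8 * X) + 32) + noise

reg = LinearRegression().fit(X, y)
b = reg.intercept_.item()
w_1 = reg.coef_.item()

# The learned parameters should be close to the generating ones;
# how close depends on the noise standard deviation and sample size.
print(abs(b - 32) < 2, abs(w_1 - 1.8) < 0.2)
```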
Portions of this page are reproduced from work created and shared by Google and used according to terms described in the Creative Commons 4.0 Attribution License.
This page references the following sources:
- Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 3rd Edition (ch. 4)
- Google Developers Machine Learning Crash Course
- Google Machine Learning Glossary
- Machine Learning Refined blog (ch. 8)
- Wikipedia's "Linear regression" article
- Wikipedia's "Linear equation" article
- scikit-learn API reference