Linear Regression
Linear regression is a regression algorithm that predicts a target value by computing a linear combination of the input features plus a constant. In other words, linear regression works by drawing the best-fitting line through a dataset and then plugging an example's feature values into the linear equation to make predictions.
We can represent a linear regression model with the following equation:

ŷ = b + w₁x₁ + w₂x₂ + … + wₙxₙ

where:

- ŷ is the predicted value
- b is a constant (called the bias term or the intercept)
- wᵢ is the weight of the i-th feature (also called a coefficient)
- xᵢ is the value of the i-th feature
- n is the number of features
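As a sketch of the equation above, the weighted sum can be computed with a NumPy dot product. The parameter and feature values here are illustrative, not from the example that follows:

```python
import numpy as np

# Illustrative parameters for a model with n = 3 features
b = 2.0                          # bias term (intercept)
w = np.array([0.5, -1.0, 3.0])   # weights w1, w2, w3
x = np.array([4.0, 2.0, 1.0])    # feature values x1, x2, x3

# ŷ = b + w1*x1 + w2*x2 + w3*x3
y_hat = b + np.dot(w, x)
print(y_hat)  # 2.0 + 2.0 - 2.0 + 3.0 = 5.0
```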
For example, we would define a model with one feature that has the parameters b = 32 and w₁ = 9/5 using the following equation:

ŷ = 32 + (9/5)x₁

To predict the output of an example with the feature x₁ = 10, we would calculate:

ŷ = 32 + (9/5 × 10) = 32 + 18 = 50
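The hand calculation above can be verified directly in Python, using the same parameters b = 32 and w₁ = 9/5:

```python
# Parameters from the worked example
b = 32
w_1 = 9 / 5  # 1.8
x_1 = 10

# ŷ = b + w1*x1
y_hat = b + (w_1 * x_1)
print(y_hat)  # 50.0
```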
Linear regression in scikit-learn
import numpy as np
from sklearn.linear_model import LinearRegression

# Generate synthetic data along the line y = 1.8x + 32 with Gaussian noise
NUM_EXAMPLES = 150
rng = np.random.default_rng(seed=0)
noise = rng.normal(0, 3, NUM_EXAMPLES).reshape(-1, 1)
X = np.linspace(0, 50, NUM_EXAMPLES).reshape(-1, 1)
y = ((1.8 * X) + 32) + noise

# Fit the model and extract the learned parameters
reg = LinearRegression().fit(X, y)
b = reg.intercept_.item()
w_1 = reg.coef_.item()

# Predict the target value for a new example with x1 = 10
x_1 = 10
pred = reg.predict(np.array([[x_1]])).item()
print(f"ŷ = b + w\u2081x\u2081 = {b} + ({w_1} * {x_1}) = {pred}")
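Because the synthetic data was generated from the line y = 1.8x + 32, the fitted intercept and coefficient should land close to those generating values. A quick sketch of that check (the tolerances here are arbitrary and depend on the noise level):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Regenerate the same synthetic data (y = 1.8x + 32 plus noise)
NUM_EXAMPLES = 150
rng = np.random.default_rng(seed=0)
noise = rng.normal(0, 3, NUM_EXAMPLES).reshape(-1, 1)
X = np.linspace(0, 50, NUM_EXAMPLES).reshape(-1, 1)
y = ((1.8 * X) + 32) + noise

reg = LinearRegression().fit(X, y)
b = reg.intercept_.item()
w_1 = reg.coef_.item()

# The learned parameters should be close to the generating ones;
# how close depends on the noise standard deviation and sample size.
print(abs(b - 32) < 2, abs(w_1 - 1.8) < 0.2)
```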
Portions of this page are reproduced from work created and shared by Google and used according to terms described in the Creative Commons 4.0 Attribution License.
This page references the following sources:
- Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 3rd Edition (ch. 4)
- Google Developers Machine Learning Crash Course
- Google Machine Learning Glossary
- Machine Learning Refined blog (ch. 8)
- Wikipedia's "Linear regression" article
- Wikipedia's "Linear equation" article
- scikit-learn API reference