Simple linear regression uses a single input feature to predict the output.
- Example: Predicting a student’s salary based on their CGPA.
import pandas as pd
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt
# Sample dataset
data = pd.DataFrame({'CGPA': [2.5, 3.0, 3.5, 4.0, 4.5],
'Salary': [25000, 30000, 35000, 40000, 45000]})
X = data[['CGPA']]
y = data['Salary']
# Model training
model = LinearRegression()
model.fit(X, y)
# Predict and plot
plt.scatter(X, y, color='blue')
plt.plot(X, model.predict(X), color='red')
plt.title('CGPA vs Salary')
plt.xlabel('CGPA')
plt.ylabel('Salary')
plt.show()
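Once the model is fitted, its learned slope and intercept can be inspected and used for predictions on unseen inputs. A minimal sketch using the same toy dataset (the CGPA value 3.8 is just an illustrative input):

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

data = pd.DataFrame({'CGPA': [2.5, 3.0, 3.5, 4.0, 4.5],
                     'Salary': [25000, 30000, 35000, 40000, 45000]})
model = LinearRegression().fit(data[['CGPA']], data['Salary'])

# Slope and intercept of the fitted line; this toy data is exactly linear
# (Salary = 10000 * CGPA), so the slope is ~10000 and the intercept ~0.
print('Slope:', model.coef_[0])
print('Intercept:', model.intercept_)

# Predict the salary for a student with an unseen CGPA.
new = pd.DataFrame({'CGPA': [3.8]})
print('Predicted salary:', model.predict(new)[0])  # ~38000
```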
Multiple linear regression uses two or more input features to predict the output.
- Example: Predicting a house price based on size, number of rooms, and location.
# Sample dataset
data = pd.DataFrame({'Size': [1200, 1500, 1800, 2100, 2500],
'Rooms': [2, 3, 3, 4, 4],
'Price': [300000, 350000, 400000, 450000, 500000]})
X = data[['Size', 'Rooms']]
y = data['Price']
# Model training
model = LinearRegression()
model.fit(X, y)
print('Intercept:', model.intercept_)
print('Coefficients:', model.coef_)
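As in the single-feature case, the fitted model can predict the price of a new house. A small sketch (the query house of 2000 sq ft with 3 rooms is a made-up input):

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

data = pd.DataFrame({'Size': [1200, 1500, 1800, 2100, 2500],
                     'Rooms': [2, 3, 3, 4, 4],
                     'Price': [300000, 350000, 400000, 450000, 500000]})
model = LinearRegression().fit(data[['Size', 'Rooms']], data['Price'])

# Predict the price for a house not in the training data.
new_house = pd.DataFrame({'Size': [2000], 'Rooms': [3]})
pred = model.predict(new_house)[0]
print('Predicted price:', pred)
```

Note that `predict` expects the same column layout used during `fit`, which is why the query is passed as a DataFrame with `Size` and `Rooms` columns.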
Regularization helps prevent overfitting by adding a penalty for large coefficients.
- Ridge Regression (L2): Adds squared magnitude of coefficients as penalty.
- Lasso Regression (L1): Adds absolute value of coefficients as penalty (can shrink coefficients to zero, performing feature selection).
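The two penalties above can be written as modified least-squares objectives (up to a constant scaling of the error term, which varies by library):

```latex
\hat{\beta}^{\text{ridge}} = \arg\min_{\beta}\; \lVert y - X\beta \rVert_2^2 + \alpha \sum_{j} \beta_j^2
\qquad
\hat{\beta}^{\text{lasso}} = \arg\min_{\beta}\; \lVert y - X\beta \rVert_2^2 + \alpha \sum_{j} \lvert \beta_j \rvert
```

Larger values of the hyperparameter α mean stronger shrinkage of the coefficients.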
from sklearn.linear_model import Ridge, Lasso
# Ridge Regression
ridge = Ridge(alpha=1.0)
ridge.fit(X, y)
print('Ridge Coefficients:', ridge.coef_)
# Lasso Regression
lasso = Lasso(alpha=0.1)
lasso.fit(X, y)
print('Lasso Coefficients:', lasso.coef_)
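The feature-selection behavior of Lasso is easiest to see on synthetic data where one feature is irrelevant: Lasso tends to drive that coefficient to zero, while Ridge only shrinks it. A sketch (the data, the random seed, and the alpha values are all illustrative choices):

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Synthetic data: only the first feature actually drives the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=100)

lasso = Lasso(alpha=0.5).fit(X, y)
ridge = Ridge(alpha=0.5).fit(X, y)

print('Lasso coefficients:', lasso.coef_)  # irrelevant feature shrunk to (near) zero
print('Ridge coefficients:', ridge.coef_)  # both coefficients kept, only shrunk
```

This is why Lasso is often used when you suspect many features are uninformative, whereas Ridge is preferred when all features are expected to contribute a little.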