## Least Square Regression Method for AI and ML

As Machine Learning and Artificial Intelligence become the backbone of today’s tech world, it is important to learn **popular methods** like “*Least Square Regression*” to understand the math behind **regression analysis** along with the **implementation practices with Python**.

In this blog, we provide in-depth knowledge of the **Least Square Regression** method. Enroll in our AI Training in Chennai for the best practices.

## What is the Least Square Regression Method?

Least-square regression is a technique used in regression analysis for ML models and AI applications.

It is one of the most popular mathematical methods for finding the best fit line, the line that describes the relationship between a dependent variable and one or more independent variables.

## What is the Best Fit Line?

The best fit line is a line drawn across a scatter plot of data points to describe the relationship between two or more variables.

It is used in regression analysis to obtain a definite relationship between the predictor variable and the target variable.

The least-square regression method is an effective way to draw the line of best fit. It chooses the line that minimizes the sum of the squared differences between the observed and the predicted values, which is why it is named the “*Least Square Regression*” method.
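To make the idea of "least squares" concrete, the quantity being minimized can be written out. Here $(x_i, y_i)$ are the $n$ data points, and $m$ and $c$ are the slope and intercept of a candidate line; the symbol $S$ is introduced here only for illustration.

$$S(m, c) = \sum_{i=1}^{n} \bigl(y_i - (m x_i + c)\bigr)^2$$

The line of best fit is the pair $(m, c)$ for which $S$ is smallest.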

### Steps for calculating the line of the best fit

To construct the line that best describes the relationship between the variables in the data, we start from the equation of a straight line.

$$y = mx + c$$

Here, y denotes the dependent variable, m denotes the slope of the line, x denotes the independent variable, and c denotes the y-intercept.

Step 1: Calculate the slope ‘m’ with the following formula, where n is the number of data points

$$m = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}$$

Step 2: Compute the y-intercept, the value of y at the point where the line crosses the y-axis, using the following formula, where $\bar{x}$ and $\bar{y}$ are the means of x and y

$$c = \bar{y} - m\bar{x}$$

Step 3: Substitute the values of m and c into the final equation

$$y = mx + c$$
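The three steps can be sketched in plain Python as follows. The function name `fit_line` and the sample points are assumptions made for illustration; the points are chosen to lie exactly on y = 2x + 1 so the result is easy to verify by hand.

```python
def fit_line(xs, ys):
    """Return (m, c) for the least-squares line y = m*x + c."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    # Step 1: slope from the sums
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Step 2: intercept from the means of y and x
    c = sum_y / n - m * (sum_x / n)
    return m, c


# Step 3: the fitted line is y = m*x + c
m, c = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(m, c)  # 2.0 1.0
```

Because the sample points sit exactly on a line, the fit recovers its slope and intercept; with real data the points scatter around the line instead.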

### Example for Least Square Regression Method

| Price of Cars in Rupees (x) | Number of Cars Sold (y) |
| --- | --- |
| 100000 | 12 |
| 154748 | 9 |
| 184587 | 7 |
| 214785 | 5 |
| 245870 | 2 |

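Applying the slope and intercept formulas to this data gives m ≈ -0.0000674 and c ≈ 19.13. The table below lists the value predicted by the line for each price, along with the error (actual value minus predicted value).

| Price of Cars in Rupees (x) | Number of Cars Sold (y) | Y = mx + c | Error (y - Y) |
| --- | --- | --- | --- |
| 100000 | 12 | 12.39 | -0.39 |
| 154748 | 9 | 8.70 | 0.30 |
| 184587 | 7 | 6.69 | 0.31 |
| 214785 | 5 | 4.66 | 0.34 |
| 245870 | 2 | 2.56 | -0.56 |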

With this line, it is easy to estimate how many cars are likely to sell at a given price and to choose a price that maximizes sales.
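As a minimal sketch, the fit for the five price and sales pairs from the first table can be computed with NumPy. It uses the mean-centered form of the slope formula, which is algebraically equivalent to the sum form in Step 1. The helper `estimate_sales` and the query price of 200000 are hypothetical, added only to illustrate how the fitted line is used for estimation.

```python
import numpy as np

# Price (x) and number of cars sold (y) from the example table
price = np.array([100000, 154748, 184587, 214785, 245870], dtype=float)
sold = np.array([12, 9, 7, 5, 2], dtype=float)

# Slope and intercept (mean-centered form of the least-squares formulas)
dx = price - price.mean()
dy = sold - sold.mean()
m = (dx * dy).sum() / (dx ** 2).sum()
c = sold.mean() - m * price.mean()

predicted = m * price + c
error = sold - predicted  # residuals: actual minus predicted


def estimate_sales(p):
    """Hypothetical helper: estimated number of cars sold at price p."""
    return m * p + c


print(f"m = {m:.7f}, c = {c:.2f}")
print(np.round(predicted, 2))
print(np.round(error, 2))
print(round(estimate_sales(200000), 2))
```

The slope is negative, so the estimated number of cars sold falls as the price rises.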

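Important things to keep in mind before implementing the **Least Square Regression** method

- The data must be free of outliers, because they can pull the line of best fit away from the bulk of the data.
- The line of best fit can also be drawn iteratively, adjusting it until the sum of squared errors is as small as possible.
- The method works well on linear data but may not hold for non-linear data.
- The difference between the actual value of y and the predicted value of y is the residual, which denotes the error.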
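The effect of an outlier can be illustrated with a small sketch. The data here is made up for the purpose: five points exactly on y = 2x, and a copy in which the last value is replaced by an outlier. The function name `fit_line` is an assumption for illustration.

```python
import numpy as np


def fit_line(x, y):
    """Least-squares slope and intercept for y = m*x + c."""
    dx = x - x.mean()
    m = (dx * (y - y.mean())).sum() / (dx ** 2).sum()
    c = y.mean() - m * x.mean()
    return m, c


x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_clean = 2.0 * x              # points exactly on y = 2x
y_outlier = y_clean.copy()
y_outlier[-1] = 30.0           # replace the last value (10) with an outlier

m_clean, c_clean = fit_line(x, y_clean)
m_out, c_out = fit_line(x, y_outlier)

print(m_clean, c_clean)  # slope 2, intercept 0
print(m_out, c_out)      # slope 6, intercept -8
```

A single extreme point triples the slope in this example, which is why outliers should be examined before fitting.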

#### Least Square Regression Implementation in Python

Here, we build a model to understand the relationship between the head size and the brain weight of an individual, using a Python implementation of the **Least Square Regression** method.

We require a data set that contains gender as a binary value, age range, head size in cubic centimeters, and brain weight in grams.

Logic to be applied: implement linear regression to develop a model that shows the relationship between an independent variable and a dependent variable.

**Step 1: Import the libraries**

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt

**Step 2: Import the required data set**

    # Reading Data
    # Replace the path with the location of headbrain.csv on your machine
    data = pd.read_csv('C:UsersNeelTempDesktopheadbrain.csv')
    print(data.shape)

Output:

    (237, 4)

Next, inspect the first few rows:

    print(data.head())

Output:

       Gender  Age Range  Head Size(cm^3)  Brain Weight(grams)
    0       1          1             4512                 1530
    1       1          1             3738                 1297
    2       1          1             4261                 1335
    3       1          1             3777                 1282
    4       1          1             4177                 1590

**Step 3: Assigning ‘x’ as independent variable and ‘y’ as dependent variable**

    # Computing X and Y
    X = data['Head Size(cm^3)'].values
    Y = data['Brain Weight(grams)'].values

Next, compute the means and the number of observations:

    # Mean X and Y
    mean_x = np.mean(X)
    mean_y = np.mean(Y)

    # Total number of values
    n = len(X)

**Step 4: Calculating the values of the slope and y-intercept**

    # Using the formula to calculate 'm' and 'c'
    numer = 0
    denom = 0
    for i in range(n):
        numer += (X[i] - mean_x) * (Y[i] - mean_y)
        denom += (X[i] - mean_x) ** 2
    m = numer / denom
    c = mean_y - (m * mean_x)

    # Printing coefficients
    print("Coefficients")
    print(m, c)

Output:

    Coefficients
    0.26342933948939945 325.57342104944223
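The loop in Step 4 can also be written without explicit iteration. The sketch below is one possible vectorized version, cross-checked against NumPy's `np.polyfit` with degree 1. Since the CSV file is not bundled here, it uses randomly generated stand-in data; the ranges and noise level are assumptions and not taken from the real data set.

```python
import numpy as np

# Hypothetical stand-in data; with the real data set, use X and Y from Step 3
rng = np.random.default_rng(0)
X = rng.uniform(3000, 4500, size=50)
Y = 0.26 * X + 325 + rng.normal(0, 70, size=50)

# Vectorized equivalent of the Step 4 loop
dx = X - X.mean()
m = np.sum(dx * (Y - Y.mean())) / np.sum(dx ** 2)
c = Y.mean() - m * X.mean()

# Cross-check: np.polyfit returns coefficients from highest degree down
m_np, c_np = np.polyfit(X, Y, 1)

print(m, c)
print(m_np, c_np)
```

Both approaches solve the same least-squares problem, so the two printed pairs should agree up to floating-point rounding.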

**Step 5: Plotting the line of best fit**

    # Plotting Values and Regression Line
    max_x = np.max(X) + 100
    min_x = np.min(X) - 100

    # Calculating line values x and y
    x = np.linspace(min_x, max_x, 1000)
    y = c + m * x

    # Plotting Line
    plt.plot(x, y, color='#58b970', label='Regression Line')
    # Plotting Scatter Points
    plt.scatter(X, Y, c='#ef5423', label='Scatter Plot')

    plt.xlabel('Head Size in cm3')
    plt.ylabel('Brain Weight in grams')
    plt.legend()
    plt.show()

**Step 6: Evaluating the Model**

    # Calculating Root Mean Squared Error
    rmse = 0
    for i in range(n):
        y_pred = c + m * X[i]
        rmse += (Y[i] - y_pred) ** 2
    rmse = np.sqrt(rmse/n)
    print("RMSE")
    print(rmse)

Output:

    RMSE
    72.1206213783709

The model can also be evaluated with another metric, the R-squared (R²) score, which can be computed in Python as follows

    # Calculating R2 Score
    ss_tot = 0
    ss_res = 0
    for i in range(n):
        y_pred = c + m * X[i]
        ss_tot += (Y[i] - mean_y) ** 2
        ss_res += (Y[i] - y_pred) ** 2
    r2 = 1 - (ss_res/ss_tot)
    print("R2 Score")
    print(r2)

Output:

    R2 Score
    0.6393117199570003
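Both metrics can be wrapped in small reusable functions. The sketch below is one possible vectorized form; the function names `rmse` and `r2_score` and the sample arrays are assumptions, with values picked so the results can be checked by hand.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between actual and predicted values."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r2_score(y_true, y_pred):
    """R-squared: 1 minus residual sum of squares over total sum of squares."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1 - ss_res / ss_tot)


# Hypothetical values: every prediction is off by exactly 1
y_true = np.array([0.0, 0.0, 4.0, 4.0])
y_pred = np.array([1.0, 1.0, 3.0, 3.0])

print(rmse(y_true, y_pred))      # 1.0
print(r2_score(y_true, y_pred))  # 0.75
```

RMSE is expressed in the units of the target (grams in the brain weight example), while R² is unitless, with values closer to 1 indicating that the line explains more of the variation in y.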

**Conclusion**

This blog explained how to find the line of best fit with the **Least Square Regression** method, both in theory and through a Python implementation. We hope it has made the method clear.

Join Softlogic Systems to gain hands-on exposure to the popular methods and get the best practices for implementing them in your desired programming languages.

Enroll today for the best Machine Learning Training in Chennai for your bright future.