
How to add a line of best fit to a scatter plot

Hello everyone. Today we are going to learn how to add a line of best fit to a scatter plot in Python. This article covers two methods: Seaborn's regplot(), and NumPy's polyfit() together with poly1d().

Let's get started.

How to add a line of best fit to a scatter plot?

    You can use np.polyfit() and np.poly1d(). Fit a first-degree polynomial to the x and y values, evaluate it at the same x values, and add the result to the Axes object returned by the scatter plot.

Method 1

You can do the whole fit and plot in one fell swoop with Seaborn.

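import pandas as pd
import seaborn as sns

# Read the whitespace-separated file; the raw string avoids an invalid-escape warning
data_reduced = pd.read_csv('fake.txt', sep=r'\s+')
# regplot draws the scatter points and the fitted regression line together
sns.regplot(x=data_reduced['2005'], y=data_reduced['2015'])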
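As a rough sketch of what the read_csv() call above expects, the snippet below parses a small whitespace-separated sample from a string instead of a file. The sample text and the name `sample` are assumptions made for illustration (three rows borrowed from the data used in Method 2); the actual contents of fake.txt are not shown here. Note that columns read this way are labelled with the strings '2005' and '2015'.

```python
import io
import pandas as pd

# Hypothetical sample standing in for fake.txt: a header row with two
# names, then rows made of a row label followed by two values.
sample = """2005 2015
0 18882 21979
1 1161 1044
2 482 558
"""

# sep=r'\s+' splits on any run of whitespace. Because each data row has one
# more field than the header, pandas uses the first field as the row index.
data_reduced = pd.read_csv(io.StringIO(sample), sep=r'\s+')

print(list(data_reduced.columns))
print(data_reduced.shape)
```

If your file is laid out differently (for example, comma-separated), the sep argument would need to change accordingly.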

Method 2

You can use np.polyfit() and np.poly1d(). Fit a first-degree polynomial to the data, evaluate it at the same x values, and add the result to the Axes object returned by the scatter plot.

import numpy as np

As an example, take a DataFrame df whose column labels are the integers 2005 and 2015:

     2005   2015
0   18882  21979
1    1161   1044
2     482    558
3    2105   2471
4     427   1467
5    2688   2964
6    1806   1865
7     711    738
8     928   1096
9    1084   1309
10    854    901
11    827   1210
12   5034   6253
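The DataFrame df is assumed to exist already. One way to build it, as a minimal sketch, is directly from the values in the table above. The integer column labels 2005 and 2015 match the way the columns are accessed in the code that follows.

```python
import pandas as pd

# Values copied from the table above; the column labels are integers,
# not strings, so they are accessed as df[2005] and df[2015].
df = pd.DataFrame({
    2005: [18882, 1161, 482, 2105, 427, 2688, 1806, 711, 928, 1084, 854, 827, 5034],
    2015: [21979, 1044, 558, 2471, 1467, 2964, 1865, 738, 1096, 1309, 901, 1210, 6253],
})

print(df.shape)
```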

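Estimate a first-degree polynomial:

# polyfit returns the coefficients from highest degree to lowest: [slope, intercept]
z = np.polyfit(x=df.loc[:, 2005], y=df.loc[:, 2015], deg=1)
# poly1d wraps the coefficients in a callable polynomial
p = np.poly1d(z)
# Evaluate the fitted line at each observed x value
df['trendline'] = p(df.loc[:, 2005])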

     2005   2015     trendline
0   18882  21979  21989.829486
1    1161   1044   1418.214712
2     482    558    629.990208
3    2105   2471   2514.067336
4     427   1467    566.142863
5    2688   2964   3190.849200
6    1806   1865   2166.969948
7     711    738    895.827339
8     928   1096   1147.734139
9    1084   1309   1328.828428
10    854    901   1061.830437
11    827   1210   1030.487195
12   5034   6253   5914.228708

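and plot:

import matplotlib.pyplot as plt

# Scatter plot of the raw data; returns the Axes to draw the line on
ax = df.plot.scatter(x=2005, y=2015)
# Index by the x values so the trendline is plotted against them
df.set_index(2005, inplace=True)
df.trendline.sort_index(ascending=False).plot(ax=ax)
# Reverse the x-axis so it runs from high to low
plt.gca().invert_xaxis()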

This produces a scatter plot of the 2015 values against the 2005 values, with the fitted trendline drawn over the points.

The coefficients in z also give the equation of the line:

'y={0:.2f} x + {1:.2f}'.format(z[0],z[1])

y=1.16 x + 70.46
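For a straight line, np.polyfit() computes an ordinary least-squares fit, so the slope and intercept can be cross-checked against the textbook closed-form formulas. The sketch below does that with the same data; the variable names are only illustrative.

```python
import numpy as np

# Same values as the 2005 and 2015 columns used above
x = np.array([18882, 1161, 482, 2105, 427, 2688, 1806,
              711, 928, 1084, 854, 827, 5034], dtype=float)
y = np.array([21979, 1044, 558, 2471, 1467, 2964, 1865,
              738, 1096, 1309, 901, 1210, 6253], dtype=float)

# Closed-form least squares for a line:
#   slope = sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))**2)
#   intercept = mean(y) - slope * mean(x)
dx = x - x.mean()
slope = (dx * (y - y.mean())).sum() / (dx ** 2).sum()
intercept = y.mean() - slope * x.mean()

# np.polyfit returns coefficients from highest degree to lowest
z = np.polyfit(x, y, deg=1)

print('y={0:.2f} x + {1:.2f}'.format(slope, intercept))
print(np.allclose(z, [slope, intercept]))
```

Both routes should agree up to floating-point rounding. For anything beyond a straight line, np.polyfit() with a higher deg is the more convenient option.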

Conclusion

That's all for this topic. I hope these methods helped you. Leave a comment below with your thoughts and questions, and let us know which method worked for you. Thank you.
