Hello everyone. Today we are going to learn **how to calculate the correlation between all columns and remove the highly correlated ones using pandas** **in Python**. I will explain two possible methods here.

Let's get started.

## How to calculate correlation between all columns and remove highly correlated ones using pandas?

**How to calculate correlation between all columns and remove highly correlated ones using pandas?** For a given DataFrame `df`, compute the absolute correlation matrix and then locate the entries above a threshold (0.8 here):

    import numpy as np

    corr_matrix = df.corr().abs()
    high_corr_var = np.where(corr_matrix > 0.8)
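To see what these two lines actually return, here is a minimal sketch on made-up data. The DataFrame and its column names `a`, `b`, `c` are assumptions for illustration only. It shows that the raw `np.where` result contains positional indices, includes the diagonal, and lists every correlated pair twice.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: "b" is exactly 2 * "a"; "c" is only weakly related to both.
df = pd.DataFrame({
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10],
    "c": [5, 1, 4, 2, 3],
})

corr_matrix = df.corr().abs()

# np.where with a single argument returns a tuple of index arrays (rows, cols).
rows, cols = np.where(corr_matrix > 0.8)
positions = list(zip(rows.tolist(), cols.tolist()))

# Each position is a (row, column) index into corr_matrix, not a column name.
print(positions)
```

Every column correlates perfectly with itself, so the diagonal always passes the threshold, and because the matrix is symmetric the pair (`a`, `b`) shows up in both orders. That is why the result needs filtering before it can be used to pick columns.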

## Method 1

Here is the approach I have used. The function walks the lower triangle of the correlation matrix and deletes, in place, every column whose absolute correlation with an earlier column that has not been deleted is at least `threshold`:

    def correlation(dataset, threshold):
        col_corr = set()  # names of the deleted columns
        # Absolute values, so strong negative correlations count too
        corr_matrix = dataset.corr().abs()
        for i in range(len(corr_matrix.columns)):
            for j in range(i):
                if (corr_matrix.iloc[i, j] >= threshold) and (corr_matrix.columns[j] not in col_corr):
                    colname = corr_matrix.columns[i]  # name of the column to delete
                    col_corr.add(colname)
                    if colname in dataset.columns:
                        del dataset[colname]  # delete the column from the dataset
        print(dataset)
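If you would rather not modify the DataFrame in place, a commonly used alternative is to mask the correlation matrix down to its strict upper triangle and drop every column that has a high value there. The sketch below is not the function above: the name `drop_correlated`, the default threshold, and the toy data are assumptions for illustration. It also differs slightly in behavior, because it compares each column with all earlier columns, including ones that are themselves dropped.

```python
import numpy as np
import pandas as pd

def drop_correlated(df, threshold=0.8):
    """Return (reduced copy of df, list of dropped column names)."""
    corr = df.corr().abs()
    # Boolean mask that is True only strictly above the diagonal,
    # so each pair of columns is looked at exactly once.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    upper = corr.where(mask)  # everything outside the mask becomes NaN
    # A column is dropped if it is highly correlated with any earlier column.
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return df.drop(columns=to_drop), to_drop

# Hypothetical toy data: "b" is exactly 2 * "a"; "c" is only weakly related.
df = pd.DataFrame({
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10],
    "c": [5, 1, 4, 2, 3],
})

reduced, dropped = drop_correlated(df, threshold=0.8)
print(dropped)
print(reduced.columns.tolist())
```

Returning the list of dropped names alongside the reduced copy makes it easy to apply the same column selection to another DataFrame later, for example a test set.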

## Method 2

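You can use the following for a given DataFrame `df`. It lists the highly correlated column pairs without dropping anything; the `x < y` filter skips the diagonal and keeps each pair only once:

    import numpy as np

    corr_matrix = df.corr().abs()
    high_corr_var = np.where(corr_matrix > 0.8)
    high_corr_var = [
        (corr_matrix.columns[x], corr_matrix.columns[y])
        for x, y in zip(*high_corr_var)
        if x < y
    ]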
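The pair list by itself does not remove any columns. One possible way to finish the job, sketched here under the assumption that you want to keep the first column of each pair and drop the second, is shown below. The toy data and the names `pairs`, `to_drop`, and `reduced` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: "b" = 2 * "a", "d" = -"a" (strong negative correlation),
# and "c" is only weakly related to the others.
df = pd.DataFrame({
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10],
    "c": [5, 1, 4, 2, 3],
    "d": [-1, -2, -3, -4, -5],
})

corr_matrix = df.corr().abs()
rows, cols = np.where(corr_matrix > 0.8)

# Keep each pair once (x < y also excludes the diagonal).
pairs = [
    (corr_matrix.columns[x], corr_matrix.columns[y])
    for x, y in zip(rows, cols)
    if x < y
]

# Assumption: drop the second member of every pair, keep the first.
to_drop = sorted({second for _, second in pairs})
reduced = df.drop(columns=to_drop)

print(pairs)
print(to_drop)
print(reduced.columns.tolist())
```

Because the matrix holds absolute values, `d` is flagged even though its correlation with `a` is negative. Which member of a pair to drop is a modeling decision; dropping the later column is just the simplest rule.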

**Conclusion**

That is all for this topic. I hope these methods helped you. Leave your thoughts and questions in the comments below, and let us know which method worked for you. Thank you.
