How to count outliers for all columns in Python?

Hello everyone, I hope you are all doing well. Today we are going to learn how to count the outliers in every column of a pandas DataFrame in Python, using the interquartile range (IQR) rule.

Let's get started.

Method 1

Random data:

import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame(np.random.randn(100, 5), columns=list('ABCDE'))
# Shift every 10th row by one random offset; this hopefully introduces some outliers
df.iloc[::10] += np.random.randn() * 2
df.head()
Out: 
          A         B         C         D         E
0  2.529517  1.165622  1.744203  3.006358  2.633023
1 -0.977278  0.950088 -0.151357 -0.103219  0.410599
2  0.144044  1.454274  0.761038  0.121675  0.443863
3  0.333674  1.494079 -0.205158  0.313068 -0.854096
4 -2.552990  0.653619  0.864436 -0.742165  2.269755

Quartile calculations:

Q1 = df.quantile(0.25)
Q3 = df.quantile(0.75)
IQR = Q3 - Q1
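
To make the thresholds explicit, here is a minimal sketch that builds the lower and upper fences for each column from the quartiles above. The names lower_fence, upper_fence and fences are only illustrative and are not part of the original method.

```python
import numpy as np
import pandas as pd

# Same random data as above
np.random.seed(0)
df = pd.DataFrame(np.random.randn(100, 5), columns=list('ABCDE'))
df.iloc[::10] += np.random.randn() * 2

Q1 = df.quantile(0.25)
Q3 = df.quantile(0.75)
IQR = Q3 - Q1

# Values below lower_fence or above upper_fence are treated as outliers
lower_fence = Q1 - 1.5 * IQR
upper_fence = Q3 + 1.5 * IQR

# One row per column, holding that column's two thresholds
fences = pd.DataFrame({'lower': lower_fence, 'upper': upper_fence})
print(fences)
```

Each column gets its own pair of fences, which is why the comparison in the next step works column by column.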

Counting the values below Q1 - 1.5 * IQR or above Q3 + 1.5 * IQR gives the number of outliers in each column:

((df < (Q1 - 1.5 * IQR)) | (df > (Q3 + 1.5 * IQR))).sum()
Out: 
A    1
B    0
C    0
D    1
E    2
dtype: int64
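
If you need this count in more than one place, the steps can be wrapped in a small helper. This is only a sketch: the function name count_outliers, its k parameter and the demo data are hypothetical, and it assumes that non-numeric columns should simply be ignored.

```python
import pandas as pd

def count_outliers(frame, k=1.5):
    """Count values outside [Q1 - k*IQR, Q3 + k*IQR] in each numeric column."""
    # Assumption: non-numeric columns are skipped rather than raising an error
    numeric = frame.select_dtypes(include='number')
    q1 = numeric.quantile(0.25)
    q3 = numeric.quantile(0.75)
    iqr = q3 - q1
    # Comparing a DataFrame with a Series aligns the Series index with the columns
    mask = (numeric < (q1 - k * iqr)) | (numeric > (q3 + k * iqr))
    return mask.sum()

# Small hand-made example: only the value 100 in column 'x' lies outside the fences
demo = pd.DataFrame({
    'x': [1, 2, 3, 4, 100],
    'y': [1, 2, 3, 4, 5],
    'label': list('abcde'),
})
counts = count_outliers(demo)
print(counts)
```

For column x the quartiles are 2 and 4, so the fences are -1 and 7, and 100 is the only value outside them. Raising k makes the rule stricter about what counts as an outlier.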

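This is in line with seaborn's box plots, which by default extend the whiskers up to 1.5 * IQR beyond the quartiles and draw the points outside that range as outliers.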

Note that the expression before .sum(), (df < (Q1 - 1.5 * IQR)) | (df > (Q3 + 1.5 * IQR)), is a boolean mask, so you can use it directly to remove outliers. For example, this sets them to NaN:

mask = (df < (Q1 - 1.5 * IQR)) | (df > (Q3 + 1.5 * IQR))
df[mask] = np.nan
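
If you would rather drop whole rows than blank out individual cells, one option is to collapse the mask across columns with any(axis=1). This is a sketch of that alternative, not part of the original method; the names rows_with_outlier and df_clean are illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the same random data so the example is self-contained
np.random.seed(0)
df = pd.DataFrame(np.random.randn(100, 5), columns=list('ABCDE'))
df.iloc[::10] += np.random.randn() * 2

Q1 = df.quantile(0.25)
Q3 = df.quantile(0.75)
IQR = Q3 - Q1
mask = (df < (Q1 - 1.5 * IQR)) | (df > (Q3 + 1.5 * IQR))

# True for every row that contains an outlier in at least one column
rows_with_outlier = mask.any(axis=1)

# Keep only the rows without any outlier
df_clean = df[~rows_with_outlier]
print(len(df), len(df_clean))
```

Dropping rows also discards the non-outlier values in those rows, so setting cells to NaN is the gentler choice when the columns are analysed independently.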

Conclusion

That's all for this topic. I hope this method helped you. Leave your thoughts and questions in the comments below. Thank you.