How do I remove rows with duplicate values of columns in a pandas DataFrame?

Hello guys, how are you all? Hope you are all fine. Today we are going to learn how to remove rows that have duplicate values in certain columns of a pandas DataFrame in Python. Here I explain the methods you can use.

Without wasting your time, let’s start this article.

Table of Contents

  1. How do I remove rows with duplicate values of columns in a pandas DataFrame?

    Use drop_duplicates with subset set to the list of columns to check for duplicates, and keep='first' to keep the first row of each group of duplicates.

  2. Method 1

  3. Method 2

Method 1

Use drop_duplicates with subset set to the list of columns to check for duplicates, and keep='first' to keep the first row of each group of duplicates.

If the DataFrame is:

import pandas as pd

df = pd.DataFrame({'Column1': ["'cat'", "'toy'", "'cat'"],
                   'Column2': ["'bat'", "'flower'", "'bat'"],
                   'Column3': ["'xyz'", "'abc'", "'lmn'"]})
print(df)

Result:

  Column1   Column2 Column3
0   'cat'     'bat'   'xyz'
1   'toy'  'flower'   'abc'
2   'cat'     'bat'   'lmn'

Then:

result_df = df.drop_duplicates(subset=['Column1', 'Column2'], keep='first')
print(result_df)

Result:

  Column1   Column2 Column3
0   'cat'     'bat'   'xyz'
1   'toy'  'flower'   'abc'
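
The keep argument of drop_duplicates also accepts 'last' and False. Here is a minimal sketch of both, assuming the same sample data without the embedded quote characters; the names last_df and none_df are only illustrative and do not come from the original example:

```python
import pandas as pd

# Sample data modelled on the Method 1 example (plain strings, no embedded quotes).
df = pd.DataFrame({'Column1': ['cat', 'toy', 'cat'],
                   'Column2': ['bat', 'flower', 'bat'],
                   'Column3': ['xyz', 'abc', 'lmn']})

# keep='last' keeps the last row of each duplicate group,
# so row 2 survives instead of row 0.
last_df = df.drop_duplicates(subset=['Column1', 'Column2'], keep='last')
print(last_df)

# keep=False drops every row that has a duplicate,
# so only row 1 ('toy', 'flower') remains.
none_df = df.drop_duplicates(subset=['Column1', 'Column2'], keep=False)
print(none_df)
```

Which value to pick depends on whether the first occurrence, the last occurrence, or none of the duplicated rows should be kept.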

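Method 2

The same call works with a single column in subset. Here rows 0 and 2 share the value "cat" in Column1, so only rows 0 and 1 remain.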

import pandas as pd

df = pd.DataFrame({"Column1": ["cat", "dog", "cat"],
                   "Column2": [1, 1, 1],
                   "Column3": ["C", "A", "B"]})

# Keep the first row for each distinct value of Column1.
df = df.drop_duplicates(subset=['Column1'], keep='first')
print(df)
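
If you want to inspect the duplicates before dropping them, the same rows can be selected with a boolean mask from DataFrame.duplicated. This is a minimal sketch, assuming the Method 2 data; the names mask and deduped are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({"Column1": ["cat", "dog", "cat"],
                   "Column2": [1, 1, 1],
                   "Column3": ["C", "A", "B"]})

# True for every row whose Column1 value already appeared in an earlier row.
mask = df.duplicated(subset=["Column1"], keep="first")
print(mask.tolist())

# Inverting the mask keeps the first occurrence of each Column1 value.
deduped = df[~mask]
print(deduped)
```

Selecting with df[mask] instead would show only the rows that drop_duplicates removes.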

Summary

That’s all about this issue. Hope these methods helped you. Comment below with your thoughts and queries, and tell us which method worked for you. Thank you.