close

how to use pandas isin for multiple columns

Hello Guys, How are you all? Hope You all Are Fine. Today We Are Going To learn about how to use pandas isin for multiple columns in Python. So Here I am Explain to you all the possible Methods here.

Without wasting your time, Let’s start This Article.

Table of Contents

how to use pandas isin for multiple columns?

  1. how to use pandas isin for multiple columns?

    The purpose of the reset_index and set_index calls are to preserve df2's index as in the desired result you posted.

  2. use pandas isin for multiple columns

    The purpose of the reset_index and set_index calls are to preserve df2's index as in the desired result you posted.

Method 1

Perform an inner merge on col1 and col2:

import pandas as pd
df1 = pd.DataFrame({'col1': ['pizza', 'hamburger', 'hamburger', 'pizza', 'ice cream'], 'col2': ['boy', 'boy', 'girl', 'girl', 'boy']}, index=range(1,6))
df2 = pd.DataFrame({'col1': ['pizza', 'pizza', 'chicken', 'cake', 'cake', 'chicken', 'ice cream'], 'col2': ['boy', 'girl', 'girl', 'boy', 'girl', 'boy', 'boy']}, index=range(10,17))

print(pd.merge(df2.reset_index(), df1, how='inner').set_index('index'))

yields

            col1  col2
index                 
10         pizza   boy
11         pizza  girl
16     ice cream   boy

The purpose of the reset_index and set_index calls are to preserve df2‘s index as in the desired result you posted. If the index is not important, then

pd.merge(df2, df1, how='inner')
#         col1  col2
# 0      pizza   boy
# 1      pizza  girl
# 2  ice cream   boy

would suffice.


Alternatively, you could construct MultiIndexs out of the col1 and col2 columns, and then call the MultiIndex.isin method:

index1 = pd.MultiIndex.from_arrays([df1[col] for col in ['col1', 'col2']])
index2 = pd.MultiIndex.from_arrays([df2[col] for col in ['col1', 'col2']])
print(df2.loc[index2.isin(index1)])

yields

         col1  col2
10      pizza   boy
11      pizza  girl
16  ice cream   boy

Method 2

Thank you unutbu! Here is a little update.

import pandas as pd
df1 = pd.DataFrame({'col1': ['pizza', 'hamburger', 'hamburger', 'pizza', 'ice cream'], 'col2': ['boy', 'boy', 'girl', 'girl', 'boy']}, index=range(1,6))
df2 = pd.DataFrame({'col1': ['pizza', 'pizza', 'chicken', 'cake', 'cake', 'chicken', 'ice cream'], 'col2': ['boy', 'girl', 'girl', 'boy', 'girl', 'boy', 'boy']}, index=range(10,17))
df1[df1.set_index(['col1','col2']).index.isin(df2.set_index(['col1','col2']).index)]

return:

    col1    col2
1   pizza   boy
4   pizza   girl
5   ice cream   boy

Summery

It’s all About this issue. Hope all Methods helped you a lot. Comment below Your thoughts and your queries. Also, Comment below which Method worked for you? Thank You.

Also, Read