How to extract specific content in a pandas dataframe with a regex?

Hello everyone. Today we are going to learn how to extract specific content from a pandas DataFrame with a regex in Python. I will go through the methods below.

Without wasting your time, let’s start the article.

Method 1

You can try str.extract and strip, but str.split is the better choice, because movie titles can contain numbers too. Another solution is to remove the content in parentheses with a regex and then strip leading and trailing whitespace:

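import pandas as pd

# sample data matching the output shown below
df = pd.DataFrame({'movie_title': ['Toy Story 2 (1995)', 'GoldenEye (1995)',
                                   'Four Rooms (1995)', 'Get Shorty (1995)',
                                   'Copycat (1995)']})

# convert the column to string
df['movie_title'] = df['movie_title'].astype(str)

# note: this pattern stops at the first digit, so numbers in movie titles are lost
df['titles'] = df['movie_title'].str.extract(r'([a-zA-Z ]+)', expand=False).str.strip()
# keep everything before the first opening parenthesis
df['titles1'] = df['movie_title'].str.split('(', n=1).str[0].str.strip()
# remove the parenthesised part with a regex (regex=True is required in pandas 2.0+)
df['titles2'] = df['movie_title'].str.replace(r'\([^)]*\)', '', regex=True).str.strip()
print(df)
          movie_title      titles      titles1      titles2
0  Toy Story 2 (1995)   Toy Story  Toy Story 2  Toy Story 2
1    GoldenEye (1995)   GoldenEye    GoldenEye    GoldenEye
2   Four Rooms (1995)  Four Rooms   Four Rooms   Four Rooms
3   Get Shorty (1995)  Get Shorty   Get Shorty   Get Shorty
4      Copycat (1995)     Copycat      Copycat      Copycat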
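
If the year is also needed, one possible variant is to capture the title and the year in a single str.extract call with named groups. This is only a sketch: it assumes every title ends with a four-digit year in parentheses, and the column names title and year are chosen here for illustration.

```python
import pandas as pd

# Sample data taken from the table above
df = pd.DataFrame({"movie_title": ["Toy Story 2 (1995)",
                                   "GoldenEye (1995)",
                                   "Four Rooms (1995)"]})

# Named groups become the column names of the result.
# The pattern is anchored to the end of the string, so only a
# trailing "(YYYY)" is treated as the year (assumption).
pattern = r"^(?P<title>.*?)\s*\((?P<year>\d{4})\)\s*$"
parts = df["movie_title"].str.extract(pattern)
print(parts)
```

Rows that do not match the pattern get NaN in both columns, so it is worth checking for missing values before relying on the result.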

Method 2

I wanted to extract the text after the “@” symbol and before the “.” (period). I tried the following, and it more or less worked, except that the result still includes the “@” symbol, which I do not want:

df['col'].astype(str).str.extract('(@.+.+)
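
One way to drop the “@” from the result is to move it outside the capture group, since str.extract returns only what the parentheses capture. The sketch below assumes that "before the period" means the first period after the “@”. The sample values and the domain column name are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; the column name 'col' follows the snippet above
df = pd.DataFrame({"col": ["alice@example.com",
                           "bob@mail.server.org",
                           "no-at-sign"]})

# "@" and the escaped "." sit outside the parentheses, so only the
# text between them is captured. [^.]+ stops at the first period.
df["domain"] = df["col"].astype(str).str.extract(r"@([^.]+)\.", expand=False)
print(df)
```

Values without an “@” followed by a period produce NaN instead of raising an error.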

Summary

That’s all about this issue. I hope these methods helped you. Comment below with your thoughts and questions, and let us know which method worked for you. Thank you.