
[Solved] AttributeError: ‘DataFrame’ object has no attribute ‘map’

The error AttributeError: ‘DataFrame’ object has no attribute ‘map’ appears in Python when you call map() directly on a Spark DataFrame. This article explains why it happens and walks through the possible solutions.

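How Does the AttributeError: ‘DataFrame’ object has no attribute ‘map’ Error Occur?

The error is raised in Python when code calls map() directly on a Spark DataFrame, for example spark_df.map(...). Since Spark 2.0, the PySpark DataFrame no longer exposes a map method, so the attribute lookup fails.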

How to Solve the AttributeError: ‘DataFrame’ object has no attribute ‘map’ Error

In short: you can't map a DataFrame directly, but you can convert the DataFrame to an RDD and map that by calling spark_df.rdd.map(). Prior to Spark 2.0, spark_df.map was an alias for spark_df.rdd.map(). The solutions below cover this fix and the alternatives available in the DataFrame API.

Solution 1

You can’t call map on a DataFrame directly, but you can convert the DataFrame to an RDD and map that by calling spark_df.rdd.map(). Prior to Spark 2.0, spark_df.map was an alias for spark_df.rdd.map(). From Spark 2.0 onward, you must explicitly call .rdd first.

Solution 2

You can use df.rdd.map(), since the PySpark DataFrame does not have map or flatMap, but be aware of the implications of using df.rdd:

Converting to an RDD breaks the DataFrame lineage: there is no predicate pushdown, no column pruning, and no SQL plan, and PySpark transformations become less efficient because every row has to be serialized between the JVM and the Python workers.

What should you do instead?

Keep in mind that the high-level DataFrame API is equipped with many alternatives. First, you can use select or selectExpr.

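Another example is using explode instead of flatMap (which exists on RDDs). The snippet below uses Scala syntax:

import org.apache.spark.sql.functions.explode
import spark.implicits._  // enables the $"col" syntax; assumes a SparkSession named spark
df.select($"name", explode($"knownLanguages"))
    .show(false)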

Result:

+-------+------+
|name   |col   |
+-------+------+
|James  |Java  |
|James  |Scala |
|Michael|Spark |
|Michael|Java  |
|Michael|null  |
|Robert |CSharp|
|Robert |      |
+-------+------+

You can also use withColumn or a UDF, depending on the use case, or another option in the DataFrame API.

Summary

That’s all about this issue. Hopefully one of these solutions helped you. Comment below with your thoughts and questions, and let us know which solution worked for you. Thank you.
