
How to create a sample single-column Spark DataFrame in Python?

Hello everyone. Today we are going to learn how to create a sample single-column Spark DataFrame in Python. This article walks through the possible methods.

Let's get started.


Method 1

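Passing a flat list of values together with only a column name does not work, because Spark cannot infer a row schema from bare strings.

When each element is a single value, you need to pass the schema as a type string: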

spark.createDataFrame(["10","11","13"], "string").toDF("age")

or as a DataType object:

from pyspark.sql.types import StringType

spark.createDataFrame(["10","11","13"], StringType()).toDF("age")

To name the column directly, the elements should be tuples and the schema a sequence of column names:

spark.createDataFrame([("10", ), ("11", ), ("13",  )], ["age"])

Method 2

There is also a fairly easy way to create a sample DataFrame in PySpark, starting from an RDD:

>>> df = sc.parallelize([[1,2,3], [2,3,4]]).toDF()
>>> df.show()
+---+---+---+
| _1| _2| _3|
+---+---+---+
|  1|  2|  3|
|  2|  3|  4|
+---+---+---+

To create it with column names, pass them to toDF:

>>> df1 = sc.parallelize([[1,2,3], [2,3,4]]).toDF(("a", "b", "c"))
>>> df1.show()
+---+---+---+
|  a|  b|  c|
+---+---+---+
|  1|  2|  3|
|  2|  3|  4|
+---+---+---+

With this approach there is no need to define a schema either, since Spark infers the column types from the data. This is probably the simplest way.

Summary

That covers this topic. Hopefully these methods helped. Leave a comment with your thoughts and questions, and let us know which method worked for you. Thank you.
