
How to use JDBC source to write and read data in (Py)Spark?

Hello guys, how are you all? Hope you are all fine. Today we are going to learn how to use a JDBC source to write and read data in (Py)Spark in Python. Here I explain all the possible methods.

Without wasting your time, let's start this article.

Table of Contents

  1. Method 1: MySQL

  2. Method 2: PostgreSQL

Method 1

Download the mysql-connector-java driver and place it in Spark's jars folder. The Python code below writes data into the table "actor1", so you have to create the actor1 table structure in the MySQL database first.

    from pyspark.sql import SparkSession

    # Use a raw string for the Windows path so the backslashes are not
    # treated as escape characters.
    spark = SparkSession.builder \
        .appName("prasadad") \
        .master("local") \
        .config("spark.driver.extraClassPath",
                r"D:\spark-2.1.0-bin-hadoop2.7\jars\mysql-connector-java-5.1.41-bin.jar") \
        .getOrCreate()

    # Read the existing "actor" table from the sakila database.
    df = spark.read.format("jdbc") \
        .options(url="jdbc:mysql://localhost:3306/sakila",
                 driver="com.mysql.jdbc.Driver",
                 dbtable="actor",
                 user="root",
                 password="****") \
        .load()

    # Write the data into the "actor1" table.
    mysql_url = "jdbc:mysql://localhost:3306/sakila?user=root&password=****"
    df.write.jdbc(mysql_url, table="actor1", mode="append")
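Instead of embedding the user and password in the JDBC URL, you can pass them once as a properties dict to both `read.jdbc` and `write.jdbc`. A minimal sketch, assuming the same localhost MySQL setup as above (the `jdbc_config` helper is illustrative, not part of any library):

```python
# Build the JDBC URL and connection properties once and reuse them for
# reads and writes. Host, database, user and password are placeholders.
def jdbc_config(host, port, database, user, password):
    url = f"jdbc:mysql://{host}:{port}/{database}"
    properties = {
        "user": user,
        "password": password,
        "driver": "com.mysql.jdbc.Driver",
    }
    return url, properties

url, props = jdbc_config("localhost", 3306, "sakila", "root", "****")
# With a SparkSession available:
# df = spark.read.jdbc(url, table="actor", properties=props)
# df.write.jdbc(url, table="actor1", mode="append", properties=props)
```

This keeps credentials out of the URL string, which is easier to maintain when the same connection is used in several places.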

Method 2

Refer to this link to download the JDBC driver for Postgres and follow the steps to download the jar file:

“/home/anand/.ivy2/jars/org.postgresql_postgresql-42.1.1.jar”

If your Spark version is 2.x:

from pyspark.sql import SparkSession

# Backslashes continue the builder chain across lines; the jar path must
# match the downloaded file name exactly.
spark = SparkSession.builder \
        .appName("sparkanalysis") \
        .config("spark.driver.extraClassPath",
                "/home/anand/.ivy2/jars/org.postgresql_postgresql-42.1.1.jar") \
        .getOrCreate()

# for a localhost database

pgDF = spark.read \
.format("jdbc") \
.option("url", "jdbc:postgresql:postgres") \
.option("dbtable", "public.user_emp_tab") \
.option("user", "postgres") \
.option("password", "****") \
.load()


print(pgDF)  # prints the DataFrame object with its schema

pgDF.filter(pgDF["user_id"]>5).show()
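The filter above is applied after Spark loads the table. You can also push the filter down to Postgres itself by passing a parenthesised subquery (aliased as a table) in the `dbtable` option, which Spark's JDBC source accepts. A sketch, reusing the table and column names from the example (the `pushdown_query` helper is illustrative):

```python
# Build a subquery string for the "dbtable" option so Postgres, not Spark,
# evaluates the WHERE clause. Table and condition are plain strings here.
def pushdown_query(table, condition):
    return f"(SELECT * FROM {table} WHERE {condition}) AS subq"

query = pushdown_query("public.user_emp_tab", "user_id > 5")
# With a SparkSession available:
# pgDF = spark.read.format("jdbc") \
#     .option("url", "jdbc:postgresql:postgres") \
#     .option("dbtable", query) \
#     .option("user", "postgres") \
#     .option("password", "****") \
#     .load()
```

This way only the matching rows cross the network, which matters for large tables.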

Save the file with a .py extension and run "python respectivefilename.py".

Conclusion

That's all about this issue. Hope these methods helped you a lot. Comment below with your thoughts and your queries, and also comment which method worked for you. Thank you.
