
How can I tear down a SparkSession and create a new one within one application?

Hello everyone, hope you are all doing well. Today we are going to look at how you can tear down a SparkSession and create a new one within one application in Python, and walk through the possible methods.

Without wasting your time, let's start this article.

Table of Contents

How can I tear down a SparkSession and create a new one within one application?

  1. Method 1

  2. Method 2

  3. Summary

Method 1

Here is a workaround, but not a solution:

I discovered that the SparkSession class in the source code contains the following __init__ (I’ve removed irrelevant lines of code from display here):

_instantiatedContext = None

def __init__(self, sparkContext, jsparkSession=None):
    self._sc = sparkContext
    if SparkSession._instantiatedContext is None:
        SparkSession._instantiatedContext = self

Therefore, I can workaround my problem by setting the _instantiatedContext attribute of the session to None after calling session.stop(). When the next module executes, it calls getOrCreate() and does not find the previous _instantiatedContext, so it assigns a new sparkContext.

This isn’t a very satisfying solution, but it serves as a workaround to meet my current needs. I’m unsure whether this entire approach of starting independent sessions is an anti-pattern or just unusual.
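To see why the workaround works without needing a live Spark cluster, here is a minimal sketch that mimics the singleton logic from the `__init__` shown above. The `MiniSession` and `MiniContext` classes are hypothetical stand-ins for `SparkSession` and `SparkContext`, not real PySpark imports:

```python
class MiniContext:
    """Stand-in for SparkContext."""
    pass


class MiniSession:
    """Stand-in mimicking SparkSession's class-level singleton cache."""
    _instantiatedContext = None

    def __init__(self, sparkContext):
        self._sc = sparkContext
        # Mirrors the real __init__: cache the first instance at class level.
        if MiniSession._instantiatedContext is None:
            MiniSession._instantiatedContext = self

    @classmethod
    def getOrCreate(cls):
        # Reuse the cached instance if one exists, otherwise build a new one.
        if cls._instantiatedContext is None:
            return cls(MiniContext())
        return cls._instantiatedContext

    def stop(self):
        # Note: stop() alone does NOT clear the class-level reference,
        # which is exactly why the workaround below is needed.
        pass


first = MiniSession.getOrCreate()
first.stop()
MiniSession._instantiatedContext = None   # the workaround: clear the cache

second = MiniSession.getOrCreate()
assert second is not first                # a genuinely new session
```

Without the `_instantiatedContext = None` line, `getOrCreate()` would keep handing back the stopped `first` instance, which is the original problem.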

Method 2

Why not pass the same SparkSession instance into the multiple stages of your pipeline? You could use a builder pattern. It sounds like you are collecting result sets at the end of each stage and then passing that data into the next stage. Consider leaving the data in the cluster within the same session, and passing the session reference and result reference from stage to stage until your application is complete.

In other words, put the

session = SparkSession.builder...

…in your main.
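A rough sketch of that structure, with the actual PySpark calls elided so the example stays self-contained. The stage functions, their names, and the `object()` placeholder for the session are all hypothetical; in real code `session` would come from `SparkSession.builder` in `main` and each stage would return a DataFrame reference rather than a dict:

```python
def stage_one(session, source):
    # In real code: e.g. session.read.parquet(source)
    return {"data": list(source)}


def stage_two(session, result):
    # In real code: e.g. result["data"].filter(...) on a DataFrame
    return {"data": [x * 2 for x in result["data"]]}


def main():
    # One session for the whole application, built once here
    # (stands in for SparkSession.builder...getOrCreate()).
    session = object()

    # Pass the session reference and the result reference stage to stage;
    # the data never has to leave the cluster between stages.
    result = stage_one(session, [1, 2, 3])
    result = stage_two(session, result)
    return result["data"]
```

The point of the design is that each stage receives the same session rather than creating (and tearing down) its own, so the question of resetting `_instantiatedContext` never arises.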

Summary

That’s all for this issue. Hopefully one of the methods above helped you. Comment below with your thoughts and questions, and let us know which method worked for you. Thank you.
