Closing a SparkSession stops the SparkContext


Vinoo Ganesh

Hi All - 

   I’ve been digging into the code while looking into what appears to be a memory leak (https://jira.apache.org/jira/browse/SPARK-27337) and have noticed something peculiar about the way closing a SparkSession is handled. Despite being marked as Closeable, closing/stopping a SparkSession simply stops the SparkContext. This change happened as a result of one of the PRs addressing https://jira.apache.org/jira/browse/SPARK-15073, in https://github.com/apache/spark/pull/12873/files#diff-d91c284798f1c98bf03a31855e26d71cR596.
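A minimal sketch of the behavior (my illustration, assuming a local Spark 2.x build; `newSession()` shares the parent's SparkContext):

```scala
// Sketch only: demonstrates that SparkSession.stop() stops the shared
// SparkContext (the Spark 2.x behavior under discussion).
import org.apache.spark.sql.SparkSession

val first = SparkSession.builder()
  .master("local[*]")
  .appName("stop-demo")
  .getOrCreate()

// A second session sharing the same JVM-global SparkContext.
val second = first.newSession()

// Despite SparkSession being Closeable, stop()/close() delegates straight
// to sparkContext.stop(), taking down every session on this context.
second.stop()

assert(first.sparkContext.isStopped) // the *other* session's context is gone too
```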

 

I’m trying to understand why this is the intended behavior – anyone have any knowledge of why this is the case?

 

Thanks,

Vinoo


Re: Closing a SparkSession stops the SparkContext

Sean Owen-2
What are you expecting there? That sounds correct. Is there something else
that needs to be closed?


---------------------------------------------------------------------
To unsubscribe e-mail: [hidden email]


Re: Closing a SparkSession stops the SparkContext

Vinoo Ganesh
Hey Sean - Cool, maybe I'm misunderstanding the intent of clearing a session vs. stopping it.

The leak looks to be caused by this line: https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/util/QueryExecutionListener.scala#L131. The ExecutionListenerBus that's added persists forever on the context's listener bus (the SparkContext's ListenerBus holds an ExecutionListenerBus). I'm trying to figure out where this cleanup should happen.

With the current implementation, calling SparkSession.stop will clean up the ExecutionListenerBus (since the context itself is stopped), but it's unclear to me why terminating one session should terminate the JVM-global context. Possibly my mental model is off here, but I would expect stopping a session to remove all traces of that session while keeping the context alive, and stopping a context to, well, stop the context.

If stopping the session is expected to stop the context, what's the intended usage of clearing the active / default session?
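A hypothetical repro of the accumulation (a sketch of the pattern as I understand that era's API, not a verified test):

```scala
// Sketch: each session's listener manager eventually hangs an
// ExecutionListenerBus off the shared context's listener bus, and nothing
// short of sc.stop() removes it, so this loop grows the bus without bound.
import org.apache.spark.sql.SparkSession

val root = SparkSession.builder().master("local[*]").getOrCreate()

for (_ <- 1 to 10000) {
  val s = root.newSession() // shares root.sparkContext
  s.listenerManager         // forces the per-session bus to be wired up
  // There is no per-session stop that unregisters the listener: it stays
  // on root.sparkContext's listener bus after `s` is abandoned.
}
```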

Vinoo




Re: Closing a SparkSession stops the SparkContext

Sean Owen-2
Yeah there's one global default session, but it's possible to create
others and set them as the thread's active session, to allow for
different configurations in the SparkSession within one app. I think
you're asking why closing one of them would effectively shut all of
them down by stopping the SparkContext. My best guess is simply, well,
that's how it works. You'd only call this, like SparkContext.stop(),
when you know the whole app is done and want to clean up. SparkSession
is a kind of wrapper on SparkContext and it wouldn't be great to make
users stop all the sessions and go find and stop the context.

If there is some per-SparkSession state that needs a cleanup, then
that's a good point, as I don't see a lifecycle method that means
"just close this session".
You're talking about SparkContext state though, right? And there is
definitely just one SparkContext. It can/should only be stopped
when the app is really done.

Is the point that each session is adding some state to the context and
doesn't have any mechanism for removing it?




Re: Closing a SparkSession stops the SparkContext

Ryan Blue
I think Vinoo is right about the intended behavior. If we support multiple sessions in one context, then stopping any one session shouldn't stop the shared context. The last session to be stopped should stop the context, but not any before that. We don't typically run multiple sessions in the same context so we haven't hit this, but it sounds reasonable.
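The "last session out stops the context" semantics could be approximated with reference counting; a Spark-free sketch of the idea (all names here are hypothetical, not Spark API):

```scala
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical sketch of the life-cycle described above: the shared
// context counts live sessions and only shuts down on the last close.
final class SharedContext {
  private val liveSessions = new AtomicInteger(0)
  @volatile var isStopped = false

  def newSession(): Session = {
    liveSessions.incrementAndGet()
    new Session(this)
  }

  // Called by Session.close(); stops the context when the count hits zero.
  def sessionClosed(): Unit =
    if (liveSessions.decrementAndGet() == 0) isStopped = true
}

final class Session(val ctx: SharedContext) extends AutoCloseable {
  private var closed = false
  // Idempotent: releases this session's claim on the context exactly once.
  override def close(): Unit =
    if (!closed) { closed = true; ctx.sessionClosed() }
}
```

With this shape, closing the first of two sessions leaves the context running and only closing the last one stops it; the hard part is that sessions which are abandoned without ever being closed would pin the context open.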




--
Ryan Blue
Software Engineer
Netflix

Re: Closing a SparkSession stops the SparkContext

Arun Mahadevan
I am not sure how it would cause a leak, though. When a Spark session or the underlying context is stopped, it should clean up everything. getOrCreate is supposed to return the active thread-local or the global session. Maybe if you keep creating new sessions after explicitly clearing the default and thread-local sessions, and keep leaking those sessions, it could happen, but I don't think sessions are intended to be used that way.




Re: Closing a SparkSession stops the SparkContext

Vinoo Ganesh

// Merging threads

 

Thanks everyone for your thoughts. I’m very much in sync with Ryan here.

 

@Sean – To the point that Ryan made, it feels wrong that stopping a session force-stops the global context. Building in logic to only stop the context when the last session is stopped feels like a solution, but the best way I can think of to do this involves storing a global list of every available SparkSession, which may be difficult.

 

@Arun – If the intention is not to be able to clear and create new sessions, then what specifically is the intended use case of sessions? https://databricks.com/blog/2016/08/15/how-to-use-sparksession-in-apache-spark-2-0.html describes SparkSessions as time-bounded interactions, which implies that old ones should be clear-able and new ones create-able in lockstep without adverse effect.
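That bookkeeping could be kept cheap with a weakly-referenced registry, so sessions that are garbage-collected without ever being stopped don't pin the context open; a hypothetical sketch (not existing Spark code):

```scala
import java.util.{Collections, WeakHashMap}

// Hypothetical sketch of the "global list of every available SparkSession"
// mentioned above, held weakly so leaked sessions can still be collected.
object SessionRegistry {
  private val live: java.util.Set[AnyRef] =
    Collections.newSetFromMap(new WeakHashMap[AnyRef, java.lang.Boolean]())

  def register(session: AnyRef): Unit =
    live.synchronized { live.add(session) }

  // Returns true when the deregistered session was the last live one,
  // i.e. the caller may now stop the shared context.
  def deregister(session: AnyRef): Boolean =
    live.synchronized { live.remove(session); live.isEmpty }
}
```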

 


Re: Closing a SparkSession stops the SparkContext

Sean Owen-2
On Tue, Apr 2, 2019 at 12:23 PM Vinoo Ganesh <[hidden email]> wrote:
> @Sean – To the point that Ryan made, it feels wrong that stopping a session force stops the global context. Building in the logic to only stop the context when the last session is stopped also feels like a solution, but the best way I can think about doing this involves storing the global list of every available SparkSession, which may be difficult.

I tend to agree it would be more natural for the SparkSession to have
its own lifecycle 'stop' method that only stops/releases its own
resources. But is that the source of the problem here? If the state
you're trying to free is needed by the SparkContext, it won't help. If
it happens to live in the SparkContext but is state needed by only one
SparkSession, and there isn't any way to clean it up now, that's a
compelling reason to change the API. Is that the situation? The only
downside is making the user separately stop the SparkContext then.



Re: Closing a SparkSession stops the SparkContext

Vinoo Ganesh

Yeah, so I think there are 2 separate issues here:

 

  1. The coupling of the SparkSession and SparkContext in their current form seems unnatural.
  2. The current memory leak, which I do believe is a case where state is added onto the SparkContext but is only needed by the session (though I would appreciate a sanity check here). Meaning, it may make sense to investigate an API change.

 

Thoughts?

 

On 4/2/19, 15:13, "Sean Owen" <[hidden email]> wrote:

    > @Sean – To the point that Ryan made, it feels wrong that stopping a session force stops the global context. Building in the logic to only stop the context when the last session is stopped also feels like a solution, but the best way I can think about doing this involves storing the global list of every available SparkSession, which may be difficult.

 

    I tend to agree it would be more natural for the SparkSession to have

    its own lifecycle 'stop' method that only stops/releases its own

    resources. But is that the source of the problem here? if the state

    you're trying to free is needed by the SparkContext, it won't help. If

    it happens to be in the SparkContext but is state only needed by one

    SparkSession and that there isn't any way to clean up now, that's a

    compelling reason to change the API.  Is that the situation? The only

    downside is making the user separately stop the SparkContext then.

 

From: Vinoo Ganesh <[hidden email]>
Date: Tuesday, April 2, 2019 at 13:24
To: Arun Mahadevan <[hidden email]>, Ryan Blue <[hidden email]>
Cc: Sean Owen <[hidden email]>, "[hidden email]" <[hidden email]>
Subject: Re: Closing a SparkSession stops the SparkContext

 

// Merging threads

 

Thanks everyone for your thoughts. I’m very much in sync with Ryan here.

 

@Sean – To the point that Ryan made, it feels wrong that stopping a session force stops the global context. Building in the logic to only stop the context when the last session is stopped also feels like a solution, but the best way I can think about doing this involves storing the global list of every available SparkSession, which may be difficult.

 

@Arun – If the intention is not to be able to clear and create new sessions, then what specific is the intended use case of Sessions? https://databricks.com/blog/2016/08/15/how-to-use-sparksession-in-apache-spark-2-0.html [databricks.com] describes SparkSessions as time bounded interactions which implies that old ones should be clear-able an news ones create-able in lockstep without adverse effect?

 

From: Arun Mahadevan <[hidden email]>
Date: Tuesday, April 2, 2019 at 12:31
To: Ryan Blue <[hidden email]>
Cc: Vinoo Ganesh <[hidden email]>, Sean Owen <[hidden email]>, "[hidden email]" <[hidden email]>
Subject: Re: Closing a SparkSession stops the SparkContext

 

I am not sure how would it cause a leak though. When a spark session or the underlying context is stopped it should clean up everything. The getOrCreate is supposed to return the active thread local or the global session. May be if you keep creating new sessions after explicitly clearing the default and the local sessions and keep leaking the sessions it could happen, but I don't think Sessions are intended to be used that way.

 

On Tue, 2 Apr 2019 at 08:45, Ryan Blue <[hidden email]> wrote:

I think Vinoo is right about the intended behavior. If we support multiple sessions in one context, then stopping any one session shouldn't stop the shared context. The last session to be stopped should stop the context, but not any before that. We don't typically run multiple sessions in the same context so we haven't hit this, but it sounds reasonable.

 

On 4/2/19, 11:44, "Sean Owen" <[hidden email]> wrote:

 

    Yeah there's one global default session, but it's possible to create

    others and set them as the thread's active session, to allow for

    different configurations in the SparkSession within one app. I think

    you're asking why closing one of them would effectively shut all of

    them down by stopping the SparkContext. My best guess is simply, well,

    that's how it works. You'd only call this, like SparkContext.stop(),

    when you know the whole app is done and want to clean up. SparkSession

    is a kind of wrapper on SparkContext and it wouldn't be great to make

    users stop all the sessions and go find and stop the context.

   

    If there is some per-SparkSession state that needs a cleanup, then

    that's a good point, as I don't see a lifecycle method that means

    "just close this session".

    You're talking about SparkContext state though right, and there is

    definitely just one SparkContext though. It can/should only be stopped

    when the app is really done.

   

    Is the point that each session is adding some state to the context and

    doesn't have any mechanism for removing it?

 

 

On Tue, Apr 2, 2019 at 8:23 AM Vinoo Ganesh <[hidden email]> wrote:

Hey Sean - Cool, maybe I'm misunderstanding the intent of clearing a session vs. stopping it.

The cause of the leak looks to be because of this line here https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/util/QueryExecutionListener.scala#L131 [github.com]. The ExecutionListenerBus that's added persists forever on the context's listener bus (the SparkContext ListenerBus has an ExecutionListenerBus). I'm trying to figure out the place that this cleanup should happen. 

With the current implementation, calling SparkSession.stop will clean up the ExecutionListenerBus (since the context itself is stopped), but it's unclear to me why terminating one session should terminate the JVM-global context. Possible my mental model is off here, but I would expect stopping a session to remove all traces of that session, while keeping the context alive, and stopping a context would, well, stop the context.

If stopping the session is expected to stop the context, what's the intended usage of clearing the active / default session?

Vinoo



Re: Closing a SparkSession stops the SparkContext

Ryan Blue
For #1, do we agree on the behavior? I think that closing a SparkSession should not close the SparkContext unless it is the only session. Evidently, that's not what happens, and I consider the current behavior a bug.

For more context, we're working on the new catalog APIs and how to guarantee consistent operations. Self-joining a table, for example, should use the same version of the table for both scans, and that state should be specific to a session, not global. These plans assume that SparkSession represents a session of interactions, along with a reasonable life-cycle. If that life-cycle includes closing all sessions when you close any session, then we can't really use sessions for this.

rb
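The life-cycle Ryan describes, where the shared context outlives individual sessions and stops only when the last one stops, could be sketched with stand-in classes like these (hypothetical names, none of them real Spark APIs):

```python
# Hypothetical stand-ins illustrating "last session out stops the context".
# These are NOT Spark classes; the names are invented for the sketch.

class MockContext:
    def __init__(self):
        self.stopped = False
        self.sessions = set()  # live sessions sharing this context

    def stop(self):
        self.stopped = True


class MockSession:
    def __init__(self, ctx):
        self.ctx = ctx
        ctx.sessions.add(self)

    def stop(self):
        # Release only this session; stop the shared context
        # only when no other session still references it.
        self.ctx.sessions.discard(self)
        if not self.ctx.sessions:
            self.ctx.stop()


ctx = MockContext()
s1, s2 = MockSession(ctx), MockSession(ctx)
s1.stop()
alive_after_first = not ctx.stopped   # context survives the first stop
s2.stop()
stopped_after_last = ctx.stopped      # the last session stops the context
```

The tracking set here is exactly the "global list of every available SparkSession" concern raised later in the thread; the sketch only shows that the bookkeeping itself is small.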

On Wed, Apr 3, 2019 at 9:35 AM Vinoo Ganesh <[hidden email]> wrote:

Yeah, so I think there are 2 separate issues here:

 

  1. The coupling of the SparkSession + SparkContext in their current form seems unnatural
  2. The current memory leak, which I do believe is a case where state is added onto the SparkContext but is only needed by the session (would appreciate a sanity check here). Meaning, it may make sense to investigate an API change.

 

Thoughts?

 

On 4/2/19, 15:13, "Sean Owen" <[hidden email]> wrote:

    > @Sean – To the point that Ryan made, it feels wrong that stopping a session force stops the global context. Building in the logic to only stop the context when the last session is stopped also feels like a solution, but the best way I can think about doing this involves storing the global list of every available SparkSession, which may be difficult.

 

    I tend to agree it would be more natural for the SparkSession to have

    its own lifecycle 'stop' method that only stops/releases its own

    resources. But is that the source of the problem here? if the state

    you're trying to free is needed by the SparkContext, it won't help. If

    it happens to be in the SparkContext but is state only needed by one

    SparkSession and that there isn't any way to clean up now, that's a

    compelling reason to change the API.  Is that the situation? The only

    downside is making the user separately stop the SparkContext then.

 

From: Vinoo Ganesh <[hidden email]>
Date: Tuesday, April 2, 2019 at 13:24
To: Arun Mahadevan <[hidden email]>, Ryan Blue <[hidden email]>
Cc: Sean Owen <[hidden email]>, "[hidden email]" <[hidden email]>
Subject: Re: Closing a SparkSession stops the SparkContext

 

// Merging threads

 

Thanks everyone for your thoughts. I’m very much in sync with Ryan here.

 

@Sean – To the point that Ryan made, it feels wrong that stopping a session force stops the global context. Building in the logic to only stop the context when the last session is stopped also feels like a solution, but the best way I can think about doing this involves storing the global list of every available SparkSession, which may be difficult.

 

@Arun – If the intention is not to be able to clear and create new sessions, then what specifically is the intended use case of sessions? https://databricks.com/blog/2016/08/15/how-to-use-sparksession-in-apache-spark-2-0.html describes SparkSessions as time-bounded interactions, which implies that old ones should be clear-able and new ones create-able in lockstep without adverse effect?

 

From: Arun Mahadevan <[hidden email]>
Date: Tuesday, April 2, 2019 at 12:31
To: Ryan Blue <[hidden email]>
Cc: Vinoo Ganesh <[hidden email]>, Sean Owen <[hidden email]>, "[hidden email]" <[hidden email]>
Subject: Re: Closing a SparkSession stops the SparkContext

 

I am not sure how it would cause a leak, though. When a Spark session or the underlying context is stopped, it should clean up everything. getOrCreate is supposed to return the active thread-local or the global session. Maybe if you keep creating new sessions after explicitly clearing the default and the local sessions, and keep leaking the sessions, it could happen, but I don't think sessions are intended to be used that way.
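The resolution order Arun describes for getOrCreate, thread-local active session first and then the global default, can be sketched with a mock (this is an illustration of the lookup order, not the real implementation):

```python
import threading

# Hypothetical mock of the getOrCreate resolution order described above:
# 1. thread-local active session, 2. global default, 3. create the default.
_local = threading.local()
_default = None


class MockSession:
    pass


def set_active(session):
    _local.active = session


def get_or_create():
    global _default
    active = getattr(_local, "active", None)
    if active is not None:          # 1. thread-local active session wins
        return active
    if _default is not None:        # 2. fall back to the global default
        return _default
    _default = MockSession()        # 3. otherwise create the default
    return _default


first = get_or_create()
again = get_or_create()             # the same default is returned
override = MockSession()
set_active(override)
active_wins = get_or_create()       # the thread-local active session wins
```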

 

On Tue, 2 Apr 2019 at 08:45, Ryan Blue <[hidden email]> wrote:

I think Vinoo is right about the intended behavior. If we support multiple sessions in one context, then stopping any one session shouldn't stop the shared context. The last session to be stopped should stop the context, but not any before that. We don't typically run multiple sessions in the same context so we haven't hit this, but it sounds reasonable.

 

On 4/2/19, 11:44, "Sean Owen" <[hidden email]> wrote:

 

    Yeah there's one global default session, but it's possible to create

    others and set them as the thread's active session, to allow for

    different configurations in the SparkSession within one app. I think

    you're asking why closing one of them would effectively shut all of

    them down by stopping the SparkContext. My best guess is simply, well,

    that's how it works. You'd only call this, like SparkContext.stop(),

    when you know the whole app is done and want to clean up. SparkSession

    is a kind of wrapper on SparkContext and it wouldn't be great to make

    users stop all the sessions and go find and stop the context.

   

    If there is some per-SparkSession state that needs a cleanup, then

    that's a good point, as I don't see a lifecycle method that means

    "just close this session".

    You're talking about SparkContext state though right, and there is

    definitely just one SparkContext though. It can/should only be stopped

    when the app is really done.

   

    Is the point that each session is adding some state to the context and

    doesn't have any mechanism for removing it?

 

 

On Tue, Apr 2, 2019 at 8:23 AM Vinoo Ganesh <[hidden email]> wrote:

Hey Sean - Cool, maybe I'm misunderstanding the intent of clearing a session vs. stopping it.

The cause of the leak looks to be this line: https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/util/QueryExecutionListener.scala#L131. The ExecutionListenerBus that's added persists forever on the context's listener bus (the SparkContext ListenerBus holds an ExecutionListenerBus). I'm trying to figure out where this cleanup should happen.

With the current implementation, calling SparkSession.stop will clean up the ExecutionListenerBus (since the context itself is stopped), but it's unclear to me why terminating one session should terminate the JVM-global context. Possibly my mental model is off here, but I would expect stopping a session to remove all traces of that session while keeping the context alive, and stopping a context to, well, stop the context.

If stopping the session is expected to stop the context, what's the intended usage of clearing the active / default session?

Vinoo
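The pattern Vinoo points at, a per-session listener registered on the context-wide bus with nothing ever removing it, can be sketched with mock objects (the real classes are SparkContext's ListenerBus and ExecutionListenerBus; these are stand-ins):

```python
# Mock of the leak: each session registers a listener on the shared
# context-wide bus at creation, and plain session teardown never removes it.
class MockContextBus:
    def __init__(self):
        self.listeners = []


class LeakySession:
    def __init__(self, bus):
        self.bus = bus
        self.listener = object()          # stands in for ExecutionListenerBus
        bus.listeners.append(self.listener)

    def stop(self):
        pass                              # nothing unregisters the listener


class FixedSession(LeakySession):
    def stop(self):
        # Session-scoped cleanup: unregister this session's listener
        # from the shared bus instead of leaving it behind.
        self.bus.listeners.remove(self.listener)


bus = MockContextBus()
for _ in range(3):
    LeakySession(bus).stop()
leaked = len(bus.listeners)               # listeners accumulate across sessions

bus2 = MockContextBus()
for _ in range(3):
    FixedSession(bus2).stop()
remaining = len(bus2.listeners)           # bus is empty after cleanup
```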




--
Ryan Blue
Software Engineer
Netflix

Re: Closing a SparkSession stops the SparkContext

Vinoo Ganesh

Picking up this email thread again around point #1 below, filed https://issues.apache.org/jira/browse/SPARK-27958 and put up a PR (still have to write tests) https://github.com/apache/spark/pull/24807 just to begin the conversation.
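One possible shape for the change such a PR could explore, a session-scoped close distinct from the context-wide stop, can be sketched as follows (hypothetical mock classes for illustration, not the actual patch):

```python
# Hypothetical sketch separating session-scoped close() from the
# context-wide stop(); names invented, not the actual SPARK-27958 change.
class MockContext:
    def __init__(self):
        self.stopped = False

    def stop(self):
        self.stopped = True


class MockSession:
    def __init__(self, ctx):
        self.ctx = ctx
        self.listeners = [object()]   # per-session state held on the shared context

    def close(self):
        # Session-scoped: release only this session's state,
        # leaving the shared context running.
        self.listeners.clear()

    def stop(self):
        # Context-wide: today's behavior, stopping the shared context.
        self.ctx.stop()


ctx = MockContext()
sess = MockSession(ctx)
sess.close()
context_alive = not ctx.stopped       # close() leaves the context running
sess.stop()
context_stopped = ctx.stopped         # stop() still stops the context
```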

 

 
