Trigger full GC during executor idle time?


Trigger full GC during executor idle time?

Sean Owen-2
https://github.com/apache/spark/pull/23401

Interesting PR; I thought it was not worthwhile until I saw a paper
claiming this can speed things up to the tune of 2-6%. Has anyone
considered this before?
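A minimal sketch of what such a mechanism might look like (hypothetical, not the PR's actual code; the threshold is a made-up knob):

```java
// Hypothetical sketch: when an executor has no running tasks and heap
// usage is above a threshold, proactively request a full GC so the
// pause lands in idle time rather than mid-task.
public class IdleGcSketch {
    static final double HEAP_THRESHOLD = 0.45; // hypothetical knob

    static boolean maybeGc(int runningTasks) {
        Runtime rt = Runtime.getRuntime();
        double used =
            (double) (rt.totalMemory() - rt.freeMemory()) / rt.maxMemory();
        if (runningTasks == 0 && used > HEAP_THRESHOLD) {
            System.gc(); // only a hint; the JVM may ignore it
            return true;
        }
        return false;
    }
}
```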

Sean

---------------------------------------------------------------------
To unsubscribe e-mail: [hidden email]


Re: Trigger full GC during executor idle time?

rxin
Not sure how reputable or representative that paper is...



Re: Trigger full GC during executor idle time?

Ryan Blue
After a quick look, I don't think that the paper's evaluation is very thorough. I don't see where it discusses what the PageRank implementation is doing in terms of object allocation or whether data is cached between iterations (looks like it probably isn't, based on Table III). It also doesn't address how this would interact with spark.memory.fraction. I think it would be a problem to set this threshold lower than spark.memory.fraction. And it doesn't say whether this is static or dynamic allocation.

My impression is that this is obviously a good idea for some allocation-heavy iterative workloads, but it is unclear whether it would help generally:

* An empty executor may delay starting tasks because of the optimistic GC
* Full GC instead of incremental may not be needed and could increase starting delay
* 1-core executors will always GC between tasks
* Spark-managed memory may cause long GC pauses that don't recover much space
* Dynamic allocation probably eliminates most of the benefit because of executor turnover
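A back-of-the-envelope check of the spark.memory.fraction concern, with hypothetical numbers: if the idle-GC heap threshold sits below the fraction of the heap Spark itself manages, cached blocks alone can keep usage above the trigger, so the executor would full-GC repeatedly without reclaiming that space.

```java
// Sketch of the reasoning, not real Spark code. 0.6 is the default
// spark.memory.fraction; the 0.45 idle-GC trigger is made up.
public class ThresholdCheck {
    // true when Spark-managed memory alone can exceed the GC trigger
    static boolean pathological(double memoryFraction, double idleGcThreshold) {
        return idleGcThreshold < memoryFraction;
    }

    public static void main(String[] args) {
        System.out.println(pathological(0.6, 0.45)); // prints "true"
    }
}
```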

rb


--
Ryan Blue
Software Engineer
Netflix

Re: Trigger full GC during executor idle time?

Holden Karau
Maybe it would make sense to loop in the paper authors? I imagine they might have more information than ended up in the paper.



--
Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9 

Re: Trigger full GC during executor idle time?

Mark Hamstra
Without addressing whether the change is beneficial or not, I will note that the logic in the paper and the PR's description is incorrect: "During execution, some executor nodes finish the tasks assigned to them early and wait for the entire stage to complete before more tasks are assigned to them, while other executor nodes take longer to finish." That is simply not true -- or, more generously, it is only sort of true in some circumstances where only a single Job is executing on the cluster. Less generously, there is no coordination between Executors. They simply receive Tasks from the DAGScheduler. When an Executor has idle resources, it informs the DAGScheduler, and it is the DAGScheduler that knows whether there is more work ready for the Executor.

Perhaps the DAGScheduler should send a message to the Executor if it knows that there isn't more work for the Executor to do, but I am really dubious about Executors on their own deciding, with their limited knowledge, that they are going to take a GC break unless they really need to.
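A toy sketch of that alternative (hypothetical names, nothing like Spark's actual RPC layer): the scheduler, which knows whether more tasks are pending, tells the executor when it is safe to take a GC break, instead of the executor guessing on its own.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical scheduler-driven GC handshake, in miniature.
public class SchedulerDrivenGc {
    private final Queue<Runnable> pendingTasks = new ArrayDeque<>();

    // Scheduler side: answer an executor's "I have idle resources" message.
    String onExecutorIdle() {
        return pendingTasks.isEmpty() ? "NO_WORK_SAFE_TO_GC" : "TASKS_PENDING";
    }

    // Executor side: only GC when the scheduler says nothing is pending.
    boolean handleReply(String reply) {
        if ("NO_WORK_SAFE_TO_GC".equals(reply)) {
            System.gc(); // hint only
            return true;
        }
        return false;
    }
}
```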
