GC tuning for Spark

GC tuning for Spark

Kay Ousterhout
Hi all,

I'm finding that Java GC can be a major performance bottleneck when running
Spark at high (>50% or so) memory utilization.  What GC tuning have people
tried for Spark and how effective has it been?

Thanks!

Kay

Re: GC tuning for Spark

Tathagata Das
There are a bunch of tricks noted in the Tuning
Guide<http://spark.incubator.apache.org/docs/latest/tuning.html#memory-tuning>.
You may have seen them already, but I thought they were still worth
mentioning for the record.

Besides those, if you are concerned about consistent latency (that is, low
variability in job processing times), then using the concurrent mark-sweep
(CMS) GC is recommended. Instead of big stop-the-world GC pauses, there are
many smaller pauses. This reduction in variability comes at the cost of
processing throughput though, so that's a tradeoff.
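
To judge whether a collector change helps, it can be useful to measure how much time a JVM spends collecting. The following is a minimal Java sketch, not something posted in this thread: the class name `GcPauseReport` and the allocation loop are hypothetical, and only the standard `java.lang.management` API is used. Running it once with the default collector and once with `-XX:+UseConcMarkSweepGC` (a HotSpot flag on JVMs of this era; it was removed in later JDK releases) should give a rough comparison, assuming the workload resembles yours.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper for illustration; not part of Spark.
class GcPauseReport {
    // Keeps the most recent block reachable so the allocation is not optimized away.
    static volatile byte[] sink;

    /** Approximate accumulated collection time in ms, summed over all collectors. */
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Allocates short-lived 1 MB blocks to give the collector some work. */
    public static long churn(int rounds) {
        long checksum = 0;
        for (int i = 0; i < rounds; i++) {
            byte[] block = new byte[1 << 20];
            block[0] = (byte) i;
            sink = block;
            checksum += block[0];
        }
        return checksum;
    }

    public static void main(String[] args) {
        long gcBefore = totalGcMillis();
        long start = System.nanoTime();
        churn(2000);
        long wallMillis = (System.nanoTime() - start) / 1_000_000;
        long gcMillis = totalGcMillis() - gcBefore;

        // Collector names show which GC the JVM actually selected.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount() + " collections");
        }
        System.out.println("wall ms = " + wallMillis + ", reported GC ms = " + gcMillis);
    }
}
```

The reported figure is the JVM's own approximation of collection time, so treat it as a coarse signal rather than an exact pause measurement.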

TD



Re: GC tuning for Spark

Mark Hamstra
And, of course, there are the bigger-hammer-than-GC-tuning approaches using
some combination of unchecked, off-heap and Tachyon.
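
As a rough illustration of the off-heap idea only: data held outside the Java heap is not traced by the garbage collector, so it does not add to collection work. The sketch below uses the standard `java.nio.ByteBuffer.allocateDirect` call; the class `OffHeapLongs` is hypothetical, and this is not how Spark or Tachyon implement storage.

```java
import java.nio.ByteBuffer;

// Hypothetical fixed-size store of longs backed by native (off-heap) memory.
class OffHeapLongs {
    private final ByteBuffer buf;

    public OffHeapLongs(int capacity) {
        // Direct buffers live outside the GC-managed heap.
        this.buf = ByteBuffer.allocateDirect(capacity * Long.BYTES);
    }

    public void set(int index, long value) {
        buf.putLong(index * Long.BYTES, value); // absolute write at a byte offset
    }

    public long get(int index) {
        return buf.getLong(index * Long.BYTES); // absolute read at a byte offset
    }

    public boolean isDirect() {
        return buf.isDirect();
    }

    public static void main(String[] args) {
        int n = 1000;
        OffHeapLongs store = new OffHeapLongs(n);
        for (int i = 0; i < n; i++) {
            store.set(i, i);
        }
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += store.get(i);
        }
        System.out.println("sum = " + sum + ", direct = " + store.isDirect());
    }
}
```

The tradeoff is that values must be serialized into bytes by hand, and the native memory is limited separately from the heap size.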



Re: GC tuning for Spark

Binh Nguyen
I think incorporating https://github.com/amplab/tachyon/wiki is a better
solution. I remember Matei saying that it was in his plans, but I'm not sure
about the ETA for it to happen.





--

Binh Nguyen