Hive Bucketing Support

Hive Bucketing Support

Chris Martin
Hi All,


First off, apologies if this is not the correct place to ask this!

I've been following SPARK-19256 (Hive Bucketing Support) with interest for some time now, as we do a relatively large amount of our data processing in Spark but use Hive for business analytics. As a result, we end up writing a non-trivial amount of data out twice: once in Parquet optimized for Spark and once in ORC optimized for Hive! The hope is that SPARK-19256 will put an end to this.

I've noticed that there is a PR (https://github.com/apache/spark/pull/19001) that's been open for almost a year now, with the last comment being over a month ago. Does anyone know if I should remain hopeful that this support will be added in the near future, or is it one of those things that's realistically going to be some distance off?

Thanks,

Chris



Re: Hive Bucketing Support

Abhijeet Kumar
I would ask my queries here.

Thanks,
Abhijeet Kumar

On 07-Jun-2018, at 1:03 AM, Chris Martin <[hidden email]> wrote:

Hi All,


First off, apologies if this is not the correct place to ask this!

I've been following SPARK-19256 (Hive Bucketing Support) with interest for some time now, as we do a relatively large amount of our data processing in Spark but use Hive for business analytics. As a result, we end up writing a non-trivial amount of data out twice: once in Parquet optimized for Spark and once in ORC optimized for Hive! The hope is that SPARK-19256 will put an end to this.

I've noticed that there is a PR (https://github.com/apache/spark/pull/19001) that's been open for almost a year now, with the last comment being over a month ago. Does anyone know if I should remain hopeful that this support will be added in the near future, or is it one of those things that's realistically going to be some distance off?

Thanks,

Chris