[FYI] Removing `spark-3.1.0-bin-hadoop2.7-hive1.2.tgz` from Apache Spark 3.1 distribution


[FYI] Removing `spark-3.1.0-bin-hadoop2.7-hive1.2.tgz` from Apache Spark 3.1 distribution

Dongjoon Hyun-2
Hi, All.

Since Apache Spark 3.0.0, Apache Hive 2.3.7 has been the default Hive execution library. The forked Hive 1.2.1 library is not recommended because it is not properly maintained.
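
For context, the execution library Spark is built with is independent of the metastore version you connect to; something like the following (a sketch using the documented spark.sql.hive.metastore.* configurations; the version value is illustrative) keeps the built-in Hive 2.3 execution jars while still talking to an older metastore:

    # Sketch: keep the built-in Hive 2.3 execution jars but connect to a Hive 1.2 metastore.
    # The metastore version here is illustrative; it must match your actual deployment.
    bin/spark-shell \
      --conf spark.sql.hive.metastore.version=1.2.1 \
      --conf spark.sql.hive.metastore.jars=maven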

In Apache Spark 3.1, expected in December 2020, we are going to remove it from the official distribution.

    https://github.com/apache/spark/pull/29856
    SPARK-32981 Remove hive-1.2/hadoop-2.7 from Apache Spark 3.1 distribution

Of course, users can still build it from source, because the `hive-1.2` profile is still available.
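
For anyone who still needs that combination, a build along these lines should work (a sketch; the profile names and script options are the ones in the current source tree and may change):

    # Sketch: build a distribution from source with the legacy Hive 1.2 / Hadoop 2.7 profiles.
    ./dev/make-distribution.sh --name hadoop2.7-hive1.2 --tgz \
      -Phive-1.2 -Phadoop-2.7 -Phive-thriftserver -Pyarn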

Please let us know if you still plan to build Apache Spark 3.1 with the forked, unofficial Hive 1.2.1 library. We want to hear about your pain points before moving forward in this area. Eventually, we will remove Hive 1.2 as the last piece of the migration to Hive 2.3/Hadoop 3/Java 11+.

Bests,
Dongjoon.

Re: [FYI] Removing `spark-3.1.0-bin-hadoop2.7-hive1.2.tgz` from Apache Spark 3.1 distribution

Koert Kuipers
I am a little confused about this. I assumed Spark would no longer make a distribution with Hive 1.x, but that the hive-1.2 profile would remain.

Yet I see that the hive-1.2 profile has been removed from pom.xml?
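
For reference, one way to check which profiles a checkout actually defines (a sketch; the output format depends on the Maven version):

    # Sketch: list all Maven profiles known to the build and look for hive-1.2.
    ./build/mvn help:all-profiles | grep -i hive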


Re: [FYI] Removing `spark-3.1.0-bin-hadoop2.7-hive1.2.tgz` from Apache Spark 3.1 distribution

Dongjoon Hyun-2

Bests,
Dongjoon.

On Wed, Oct 7, 2020 at 1:04 PM Koert Kuipers <[hidden email]> wrote:
I am a little confused about this. I assumed Spark would no longer make a distribution with Hive 1.x, but that the hive-1.2 profile would remain.

Yet I see that the hive-1.2 profile has been removed from pom.xml?
