-Phadoop-provided still includes hadoop jars

Kimahriman
When I try to build a distribution with either -Phive or -Phadoop-cloud along
with -Phadoop-provided, I still end up with Hadoop jars in the distribution.

Specifically, with -Phive and -Phadoop-provided you end up with
hadoop-annotations, hadoop-auth, and hadoop-common included in the Spark jars,
and with -Phadoop-cloud and -Phadoop-provided you end up with
hadoop-annotations as well as the hadoop-{aws,azure,openstack} jars. Is this
supposed to be the case, or is there something I'm doing wrong? I just want the
spark-hive and spark-hadoop-cloud jars without the Hadoop dependencies; right
now I just have to delete the Hadoop jars after the fact.
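
For reference, a minimal sketch of the kind of build and post-build cleanup
described above; the exact make-distribution.sh flags, distribution name, and
profile combination here are assumptions rather than details from this message:

    # Sketch: build a distribution with Hive/cloud support but provided Hadoop.
    ./dev/make-distribution.sh --name hadoop-provided --tgz \
      -Phive -Phadoop-cloud -Phadoop-provided

    # Workaround described above: delete the bundled Hadoop jars afterwards.
    rm -f dist/jars/hadoop-annotations-*.jar \
          dist/jars/hadoop-auth-*.jar \
          dist/jars/hadoop-common-*.jar \
          dist/jars/hadoop-{aws,azure,openstack}-*.jar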

Re: -Phadoop-provided still includes hadoop jars

Sean Owen
I don't have a good answer (Steve may know more), but from looking at dependency:tree it looks like it's mostly hadoop-common that's at issue. Without -Phive it remains 'provided' in the assembly/ module, but -Phive causes it to come back in. Either there's some good reason for that, or maybe we need to explicitly manage the scope of hadoop-common along with everything else Hadoop, even though Spark doesn't reference it directly.
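
One way to reproduce that observation, as a sketch (the single-module
invocation and grep are mine, not from this message; sibling modules may need
to be built or installed first for the assembly module to resolve on its own):

    # Compare the resolved scope of hadoop-common in the assembly module with
    # and without -Phive; look for ":compile" vs ":provided" in the output.
    ./build/mvn -Phadoop-provided        dependency:tree -pl assembly | grep hadoop-common
    ./build/mvn -Phadoop-provided -Phive dependency:tree -pl assembly | grep hadoop-common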

Re: -Phadoop-provided still includes hadoop jars

Steve Loughran


Sorry, missed this.

Yes, they should be scoped so that -Phadoop-provided leaves them out. Open a JIRA and point me at it, and I'll do my best.

The artifacts should just go into the hadoop-provided scope, shouldn't they?
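
A rough way to check that, assuming the build routes Hadoop dependency scopes
through a property that -Phadoop-provided flips to provided; the
hadoop.deps.scope property name, the module paths, and the dist/ layout below
are all assumptions on my part:

    # Which Hadoop artifacts are already tied to the switchable scope property?
    grep -n 'hadoop.deps.scope' pom.xml hadoop-cloud/pom.xml sql/hive/pom.xml

    # After rebuilding with -Phadoop-provided, no hadoop-* jars should remain.
    ls dist/jars/ | grep '^hadoop-' || echo "no bundled Hadoop jars"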
 