Spark 2.4.4 with which version of Hadoop?


Spark 2.4.4 with which version of Hadoop?

JeffK
Hi,

We've been considering using the Spark 2.4.4 download package that's
pre-built for Hadoop 2.7, running it against Hadoop 2.7.7.
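
As an aside: which Hadoop a pre-built package actually bundles can be
checked from the jars it ships, or queried from a running shell. A rough,
untested sketch; the install path below is hypothetical:

  # Hypothetical install path; adjust to wherever the package was unpacked.
  export SPARK_HOME=/opt/spark-2.4.4-bin-hadoop2.7

  # The bundled Hadoop jars carry their Hadoop version in the file names.
  ls "$SPARK_HOME"/jars/hadoop-common-*.jar

  # Or ask a running shell which Hadoop build is on the classpath
  # (VersionInfo is Hadoop's own version-reporting class).
  echo 'println(org.apache.hadoop.util.VersionInfo.getVersion)' | "$SPARK_HOME"/bin/spark-shell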

When used with Spark, Hadoop 2.7 is often cited as the most stable line.

However, Hadoop 2.7.7 is end of life, and the most recent Hadoop
vulnerabilities have only been fixed in versions 2.8.5 and above.

We've searched the Spark user forum and have been following discussions on
the development list, and it's still unclear which version of Hadoop should
be used. The Spark 3.0.0 discussions currently lean toward leaving Hadoop
2.7 as the default; given the known vulnerabilities, that's a concern.

Which versions of Hadoop 2.x are supported, and which should we be using?

Thanks

Jeff





Re: Spark 2.4.4 with which version of Hadoop?

Sean Owen
My moderately informed take is that the "Hadoop 2.7" build is really a
"Hadoop 2.x" build and AFAIK should work with 2.8 and 2.9, but I
certainly haven't tested it, nor have the PR builders. Just use the
"Hadoop provided" build on your env. Of course, you might well want to
use Hadoop 3.x (3.2.x specifically) with Spark 3, which is tested.
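
To spell out the "Hadoop provided" route: that build ships no Hadoop jars
at all, so Spark has to be handed the classpath of whatever Hadoop is
installed locally, conventionally via SPARK_DIST_CLASSPATH in
conf/spark-env.sh. A minimal sketch, assuming the hadoop launcher is on
PATH:

  # conf/spark-env.sh -- for the "without Hadoop" (Hadoop-provided) Spark build.
  # Put the locally installed Hadoop (e.g. 2.8.5) on Spark's classpath.
  export SPARK_DIST_CLASSPATH=$(hadoop classpath)

With that in place, upgrading Hadoop underneath Spark is just a matter of
upgrading the local Hadoop install.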

