is there a way for removing hadoop from spark


is there a way for removing hadoop from spark

Cristian Lorenzetto
Given that I don't need HDFS, is there a way to remove Hadoop from Spark completely?
Is YARN the only Hadoop dependency in Spark?
Is there a YARN-like library written in Java or Scala (JVM languages) that can be embedded in a project, instead of calling external servers?
Is the YARN library difficult to customize?

I am asking several questions to understand which approach is best for me.

Re: is there a way for removing hadoop from spark

yohann jardin

Hey Cristian,

You don’t need to remove anything. Spark has a standalone mode. Actually that’s the default. https://spark.apache.org/docs/latest/spark-standalone.html

When building Spark (and you should build it yourself), just use the option that suits you: https://spark.apache.org/docs/latest/building-spark.html

Regards,

Yohann Jardin

On 11-Nov-17 at 6:42 PM, Cristian Lorenzetto wrote:
> Given that I don't need HDFS, is there a way to remove Hadoop from Spark completely?
> Is YARN the only Hadoop dependency in Spark?
> Is there a YARN-like library written in Java or Scala (JVM languages) that can be embedded in a project, instead of calling external servers?
> Is the YARN library difficult to customize?
>
> I am asking several questions to understand which approach is best for me.


Re: is there a way for removing hadoop from spark

Jörn Franke
In reply to this post by Cristian Lorenzetto
Why do you even mind?


Re: is there a way for removing hadoop from spark

trsell
@Jörn Spark without Hadoop is useful:
  • For using Spark's programming model on a single beefy instance.
  • For testing and integrating with a CI/CD pipeline.
It's ugly to have tests that depend on a cluster running somewhere.
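
As a rough sketch of the CI use case, a test can start Spark in local mode inside the test process, so no cluster manager or HDFS daemon has to be running. This assumes the pyspark package is installed; the helper names (local_master, word_count_locally) and the app name are made up for illustration.

```python
def local_master(threads=None):
    """Return a Spark master URL for in-process local mode.

    "local[*]" uses all available cores; "local[N]" uses N worker threads.
    """
    if threads is None:
        return "local[*]"
    if threads < 1:
        raise ValueError("threads must be >= 1")
    return f"local[{threads}]"


def word_count_locally(lines, threads=2):
    """Count words with Spark running in local mode (hypothetical helper)."""
    # Imported lazily so local_master() stays usable without pyspark.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master(local_master(threads))   # no YARN, no standalone cluster
        .appName("ci-local-test")        # illustrative name
        .getOrCreate()
    )
    try:
        pairs = (
            spark.sparkContext.parallelize(lines)
            .flatMap(lambda line: line.split())
            .map(lambda word: (word, 1))
            .reduceByKey(lambda a, b: a + b)
            .collect()
        )
        return dict(pairs)
    finally:
        # Always stop the session so the test process can exit cleanly.
        spark.stop()
```

A test would call word_count_locally(["a b", "a"]) and compare the returned dict with the expected counts. The Hadoop client jars are still on the classpath in this setup; only the external services are absent.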


On Sun, 12 Nov 2017 at 17:17 Jörn Franke <[hidden email]> wrote:
Why do you even mind?


Re: is there a way for removing hadoop from spark

Jörn Franke
Within a CI/CD pipeline I use MiniDFSCluster and MiniYARNCluster if the production cluster also has HDFS and YARN. This has proven extremely useful and has caught a lot of errors before jobs reach the cluster (i.e. it saves a lot of money).

Works fine.

On 13. Nov 2017, at 04:36, [hidden email] wrote:

@Jörn Spark without Hadoop is useful
  • For using Spark's programming model on a single beefy instance
  • For testing and integrating with a CI/CD pipeline.
It's ugly to have tests which depend on a cluster running somewhere.



Re: is there a way for removing hadoop from spark

Sean Owen
In reply to this post by trsell
Nothing about Spark depends on a cluster. The Hadoop client libraries are required because they are part of the API, but there is no need to remove them if you aren't using YARN. Indeed, you can't remove them, but they're just libraries.
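
To illustrate the point, here is a minimal sketch of reading a plain local file with Spark in local mode. The file:// URI is resolved through the Hadoop client libraries bundled with Spark, yet no HDFS or YARN service is involved. It assumes pyspark is installed; local_uri and count_lines are hypothetical helper names.

```python
from pathlib import Path


def local_uri(path):
    """Turn a filesystem path into an absolute file:// URI."""
    return Path(path).resolve().as_uri()


def count_lines(path):
    """Count the lines of a local text file using Spark in local mode."""
    # Imported lazily so local_uri() stays usable without pyspark.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[1]")          # single in-process worker thread
        .appName("no-hdfs-read")     # illustrative name
        .getOrCreate()
    )
    try:
        # spark.read.text yields one row per line of the input file.
        return spark.read.text(local_uri(path)).count()
    finally:
        spark.stop()
```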
