Re: Catalyst dependency on Spark Core

Michael Armbrust
Yeah, sadly this dependency was introduced when someone consolidated the
logging infrastructure.  However, the dependency should be very small and
thus easy to remove, and I would like catalyst to be usable outside of
Spark.  A pull request to make this possible would be welcome.

Ideally, we'd create some sort of spark common package that has things like
logging.  That way catalyst could depend on that, without pulling in all of
Hadoop, etc.  Maybe others have opinions though, so I'm cc-ing the dev list.


On Mon, Jul 14, 2014 at 12:21 AM, Yanbo Liang <[hidden email]> wrote:

> Making Catalyst independent of Spark is a goal of Catalyst, but it may need
> time to evolve.
> I noticed that the package org.apache.spark.sql.catalyst.util
> imports org.apache.spark.util.{Utils => SparkUtils},
> so Catalyst has a dependency on Spark core.
> I'm not sure whether it will be replaced by another component independent of
> Spark in a later release.
>
>
> 2014-07-14 11:51 GMT+08:00 Aniket Bhatnagar <[hidden email]>:
>
>> As per the recent presentation given at Scala Days
>> (http://people.apache.org/~marmbrus/talks/SparkSQLScalaDays2014.pdf), it
>> was mentioned that Catalyst is independent of Spark. But on inspecting the
>> pom.xml of the sql/catalyst module, it seems to have a dependency on Spark
>> Core. Is there any particular reason for the dependency? I would love to
>> use Catalyst outside Spark.
>>
>> (Reposted as the previous email bounced. Sorry if this is a duplicate.)

Matei Zaharia
Yeah, I'd just add a spark-util that has these things.

Matei


Patrick Wendell
Adding new build modules is pretty high overhead, so if this is a case
where a small amount of duplicated code could get rid of the
dependency, that could also be a good short-term option.

- Patrick
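
For illustration only (not from the thread): if logging is the main thing Catalyst needs from core, the "small amount of duplicated code" might look roughly like the sketch below. The trait name `LocalLogging`, its methods, and the use of java.util.logging are all assumptions made to keep the example self-contained. It is not Spark's actual logging trait.

```scala
import java.util.logging.{Level, Logger}

// Hypothetical module-local logging trait (an assumption for illustration,
// not Spark's real Logging trait). It depends only on the JDK.
trait LocalLogging {
  // Created lazily, so mixing in the trait costs nothing until first use.
  private lazy val log: Logger =
    Logger.getLogger(getClass.getName.stripSuffix("$"))

  // By-name parameters: the message string is only built when the level is enabled.
  protected def logInfo(msg: => String): Unit =
    if (log.isLoggable(Level.INFO)) log.log(Level.INFO, msg)

  protected def logWarning(msg: => String): Unit =
    if (log.isLoggable(Level.WARNING)) log.log(Level.WARNING, msg)
}

// Hypothetical user of the trait.
object LocalLoggingDemo extends LocalLogging {
  def run(): String = {
    logInfo("planning query")
    "done"
  }
}
```

A class inside the module would mix in `LocalLogging` instead of importing anything from core, at the cost of keeping two small logging helpers in sync.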


Sean Owen
Agreed. You end up with a "core" and a "corer core" to distinguish
between, and it just ends up being more complicated. This sounds like
something that doesn't need a module.


Baofeng Zhang
In reply to this post by Matei Zaharia
Is Matei following this?

Catalyst uses the Utils to get the ClassLoader which loaded Spark.

Can Catalyst directly do "getClass.getClassLoader" to avoid the dependency on core?
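
As a rough sketch of what inlining could look like (the object and method names here are hypothetical, and this is not the code that was actually committed), the class-loader lookup needs nothing beyond the JDK:

```scala
// Hypothetical helper local to the module; the names are assumptions for illustration.
object ClassLoaderSketch {
  // The class loader that loaded this object, i.e. what
  // `getClass.getClassLoader` returns when called here.
  def ownClassLoader: ClassLoader = getClass.getClassLoader

  // Prefer the current thread's context class loader when one is set,
  // otherwise fall back to the loader of this object.
  def contextOrOwnClassLoader: ClassLoader =
    Option(Thread.currentThread().getContextClassLoader).getOrElse(ownClassLoader)

  def main(args: Array[String]): Unit = {
    println(s"own loader: $ownClassLoader")
    println(s"context or own loader: $contextOrOwnClassLoader")
  }
}
```

One caveat: `getClass.getClassLoader` returns the loader of whichever class it is called from, so this only matches the behaviour of a shared utility if both are loaded by the same class loader.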

Matei Zaharia
Yeah, that seems like something we can inline :).


Baofeng Zhang
I see.

So how about letting me do this simple work as my contribution? :)

It is cooool.

piyush.mukati
In reply to this post by Michael Armbrust
Hi, is there any work going on to remove the Spark dependency from
Catalyst?
Thanks


