About introduce function sum0 to Spark


About introduce function sum0 to Spark

Aron.tao

Hi, Calcite has the concept of sum0. Here I quote its definition:

Sum0 is an aggregator which returns the sum of the values which go into it like Sum. It differs in that when no non null values are applied zero is returned instead of null.
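To make the quoted definition concrete, here is a minimal plain-Python sketch of the two behaviours. It is not Spark or Calcite code: `sql_sum` and `sum0` are hypothetical helper names, and `None` stands in for SQL null.

```python
def sql_sum(values):
    """Model of SQL SUM: nulls (None) are skipped; with no non-null input the result is None."""
    non_null = [v for v in values if v is not None]
    return sum(non_null) if non_null else None


def sum0(values):
    """Model of sum0: same as sql_sum, except that 'no non-null input' yields 0."""
    result = sql_sum(values)
    return 0 if result is None else result


print(sql_sum([1, None, 2]))   # 3
print(sum0([1, None, 2]))      # 3
print(sql_sum([None, None]))   # None
print(sum0([None, None]))      # 0
print(sum0([]))                # 0
```

The two functions only disagree when there is nothing non-null to add up.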

 

One scenario is that we can use sum0 to implement a pre-calculated count (in a pre-calculation system like Apache Kylin).
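A rough illustration of that scenario, under assumptions that are not in the original message: suppose a pre-calculation system stores partial row counts per group, and a COUNT query is answered by adding up the matching partial counts. The table contents and helper names below are hypothetical, and this is plain Python rather than Kylin or Spark code.

```python
# Hypothetical pre-aggregated table: one partial row count per (city, day).
precomputed = [
    {"city": "Beijing", "day": "2018-10-22", "cnt": 3},
    {"city": "Beijing", "day": "2018-10-23", "cnt": 5},
    {"city": "Shanghai", "day": "2018-10-22", "cnt": 2},
]


def sql_sum(values):
    """Model of SQL SUM: returns None when there is no non-null input."""
    non_null = [v for v in values if v is not None]
    return sum(non_null) if non_null else None


def sum0(values):
    """Model of sum0: returns 0 when there is no non-null input."""
    result = sql_sum(values)
    return 0 if result is None else result


def count_via_sum(city):
    # Rolling up partial counts with plain SUM.
    return sql_sum(r["cnt"] for r in precomputed if r["city"] == city)


def count_via_sum0(city):
    # Rolling up partial counts with sum0.
    return sum0(r["cnt"] for r in precomputed if r["city"] == city)


print(count_via_sum("Beijing"))    # 8
print(count_via_sum0("Beijing"))   # 8
print(count_via_sum("Hangzhou"))   # None, but COUNT should report 0
print(count_via_sum0("Hangzhou"))  # 0
```

When no partial count matches the filter, only the sum0 roll-up gives the 0 that a direct COUNT over the raw rows would give.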

 

sum0 is very easy to implement in Spark. If the community considers it necessary, I would like to open a JIRA and implement it.

---
Regards!
Aron Tao

 


Re: About introduce function sum0 to Spark

Mark Hamstra
That's a horrible name. This is just a fold.

On Mon, Oct 22, 2018 at 7:39 PM 陶 加涛 <[hidden email]> wrote:

Hi, Calcite has the concept of sum0. Here I quote its definition:

Sum0 is an aggregator which returns the sum of the values which go into it like Sum. It differs in that when no non null values are applied zero is returned instead of null.

One scenario is that we can use sum0 to implement a pre-calculated count (in a pre-calculation system like Apache Kylin).

sum0 is very easy to implement in Spark. If the community considers it necessary, I would like to open a JIRA and implement it.

---
Regards!
Aron Tao

 


Re: About introduce function sum0 to Spark

Aron.tao

The name comes from Apache Calcite. It doesn't matter, though; we can introduce our own.

---
Regards!
Aron Tao

From: Mark Hamstra <[hidden email]>
Date: Tuesday, 2018-10-23 12:28
To: "[hidden email]" <[hidden email]>
Cc: dev <[hidden email]>
Subject: Re: About introduce function sum0 to Spark

That's a horrible name. This is just a fold.

 


 


Re: About introduce function sum0 to Spark

cloud0fan
This is logically `sum( if(isnull(col), 0, col) )` right?
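A plain-Python sketch of this equivalence, assuming standard SQL semantics where SUM over zero rows is null. The helper names are hypothetical and `None` stands in for null. Under that assumption the two forms agree whenever at least one row reaches the aggregate, and differ only for empty input, where the outer form `coalesce(sum(col), 0)` still yields 0.

```python
def sql_sum(values):
    """Model of SQL SUM: returns None when there is no non-null input."""
    non_null = [v for v in values if v is not None]
    return sum(non_null) if non_null else None


def sum_of_coalesced(values):
    """Model of sum(if(isnull(col), 0, col)): nulls become 0 before summing."""
    return sql_sum([0 if v is None else v for v in values])


def coalesced_sum(values):
    """Model of coalesce(sum(col), 0): a null result becomes 0 after summing."""
    result = sql_sum(values)
    return 0 if result is None else result


for rows in ([1, None, 2], [None, None], []):
    print(rows, sum_of_coalesced(rows), coalesced_sum(rows))
# [1, None, 2] 3 3
# [None, None] 0 0
# [] None 0
```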

On Tue, Oct 23, 2018 at 2:58 PM 陶 加涛 <[hidden email]> wrote:

The name comes from Apache Calcite. It doesn't matter, though; we can introduce our own.

---
Regards!
Aron Tao

 


 


Re: About introduce function sum0 to Spark

Mark Hamstra
Yes, as long as you are only talking about summing numeric values. Part of my point, though, is that this is just a special case of folding or aggregating with an initial or 'zero' value. It doesn't need to be limited to just numeric sums with zero = 0.
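The generalisation described here can be sketched in plain Python with `functools.reduce`. The `fold` helper is a hypothetical name used only for illustration, not an existing or proposed Spark API, and `None` stands in for null.

```python
import operator
from functools import reduce


def fold(values, zero, op):
    """Combine the non-null values with op, starting from zero.

    With no non-null input the result is simply zero, which is what
    sum0 does for the special case op = +, zero = 0.
    """
    return reduce(op, (v for v in values if v is not None), zero)


print(fold([1, None, 2], 0, operator.add))      # 3
print(fold([], 0, operator.add))                # 0
print(fold([2, None, 5], 1, operator.mul))      # 10
print(fold([None, None], 1, operator.mul))      # 1
print(fold(["a", None, "b"], "", operator.add)) # ab
```

Here sum0 is just `fold(values, 0, operator.add)`; other choices of `zero` and `op` give products, string concatenation, and so on.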

On Tue, Oct 23, 2018 at 12:23 AM Wenchen Fan <[hidden email]> wrote:
This is logically `sum( if(isnull(col), 0, col) )` right?
