What's the root cause of not supporting multiple aggregations in structured streaming?

KevinZwx
Hi there,

I'd like to know the root reason why multiple aggregations on a streaming DataFrame are not allowed. It's a very useful feature, and Flink has supported it for a long time.

Thanks.

Re: What's the root cause of not supporting multiple aggregations in structured streaming?

Gabor Somogyi
There is a PR for this, but it has not been merged yet.

On Mon, May 20, 2019 at 10:13 AM 张万新 <[hidden email]> wrote:
Hi there,

I'd like to know the root reason why multiple aggregations on a streaming DataFrame are not allowed. It's a very useful feature, and Flink has supported it for a long time.

Thanks.

Re: What's the root cause of not supporting multiple aggregations in structured streaming?

Arun Mahadevan
Here's the proposal for supporting it in "append" mode: https://github.com/apache/spark/pull/23576. You could see if it addresses your requirement and post your feedback in the PR.
For "update" mode it's going to be much harder to support this without first adding support for "retractions". Without them, a downstream aggregation would treat every updated row from the upstream aggregation as new input, and we would end up with wrong results.

- Arun
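
To illustrate the "update" mode problem described above, here is a toy simulation in plain Python. It is only a sketch under stated assumptions and not Spark's implementation: the function names (`first_stage_updates`, `second_stage_total`) are hypothetical, the first stage keeps a running count per key, the second stage sums those counts, and a retraction is modelled simply as a row carrying the negated old value.

```python
from collections import defaultdict


def first_stage_updates(batches, with_retractions):
    """Running count per key, emitting only changed keys after each micro-batch.

    Hypothetical model of an "update"-style output. When with_retractions is
    True, the previously emitted value is first withdrawn as a negative row.
    """
    counts = defaultdict(int)
    for batch in batches:
        before = {}
        for key in batch:
            # Remember the value that was emitted before this batch.
            before.setdefault(key, counts[key])
            counts[key] += 1
        rows = []
        for key, old in before.items():
            if with_retractions and old != 0:
                rows.append((key, -old))  # retract the stale result
            rows.append((key, counts[key]))  # emit the new result
        yield rows


def second_stage_total(update_batches):
    """Downstream aggregation: sums every row it receives."""
    total = 0
    for rows in update_batches:
        for _key, value in rows:
            total += value
    return total


batches = [["a", "b"], ["a"], ["a", "b"]]  # 5 events in 3 micro-batches

naive = second_stage_total(first_stage_updates(batches, with_retractions=False))
correct = second_stage_total(first_stage_updates(batches, with_retractions=True))

print("without retractions:", naive)
print("with retractions:   ", correct)
```

Without retractions the downstream sum counts the stale per-key results again on every update, so it drifts above the true number of events. With retractions each stale value is cancelled before the new one is added.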


On Mon, 20 May 2019 at 01:34, Gabor Somogyi <[hidden email]> wrote:
There is a PR for this, but it has not been merged yet.


Re: What's the root cause of not supporting multiple aggregations in structured streaming?

KevinZwx
Thanks, I'll check it out. 

On Tue, May 21, 2019 at 01:31, Arun Mahadevan <[hidden email]> wrote:
Here's the proposal for supporting it in "append" mode: https://github.com/apache/spark/pull/23576. You could see if it addresses your requirement and post your feedback in the PR.
For "update" mode it's going to be much harder to support this without first adding support for "retractions"; otherwise we would end up with wrong results.
