[SS] Writing a test for a possible bug in StateStoreSaveExec with Append output mode?


Jacek Laskowski
Hi,

I may have found a bug in StateStoreSaveExec with Append output mode
and would love to either prove myself wrong or help squash it by
writing a test for the case.

Is there a test for StateStoreSaveExec with Append output mode? If
not, is there a streaming test template close to what I need that I
could use as a starting point?

Thanks for any help you may offer!

Regards,
Jacek Laskowski
----
https://about.me/JacekLaskowski
Spark Structured Streaming (Apache Spark 2.2+)
https://bit.ly/spark-structured-streaming
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski


Re: [SS] Writing a test for a possible bug in StateStoreSaveExec with Append output mode?

Jacek Laskowski
Hi,

I think I know where the issue surfaces: a groupBy aggregation in
Append output mode.

What should happen when a streaming batch brings new rows for a key
(an event-time group in groupBy) in exactly the batch in which the
watermark has moved up and thus expired the state for that key?
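
As a rough illustration of the scenario in question, here is a toy
model in plain Python. It is not Spark code and makes no claim about
what StateStoreSaveExec actually does. The window length, the
watermark delay, the function names and, above all, the three rules
listed in the docstring are assumptions chosen only to make the
question concrete.

```python
from collections import defaultdict

WINDOW = 5   # tumbling window length in seconds (hypothetical value)
DELAY = 10   # watermark delay in seconds (hypothetical value)


def window_start(event_time):
    """Start of the tumbling window that contains event_time."""
    return event_time - event_time % WINDOW


def run_batches(batches):
    """Toy model of a windowed count emitted in an append-like fashion.

    Assumed rules (illustrative only, not taken from Spark's source):
      1. The watermark seen by a batch was computed from earlier batches
         as (largest event time seen) - DELAY, starting at 0.
      2. Rows with an event time below the watermark are dropped.
      3. A window is emitted once, and its state removed, as soon as
         its end is <= the watermark.
    Returns the list of (window start, count) rows emitted per batch.
    """
    state = defaultdict(int)  # window start -> running count
    watermark = 0
    output = []
    for batch in batches:
        # Rule 2: ignore rows that are older than the current watermark.
        for t in batch:
            if t >= watermark:
                state[window_start(t)] += 1
        # Rule 3: emit and evict every window the watermark has passed.
        expired = sorted(w for w in state if w + WINDOW <= watermark)
        output.append([(w, state.pop(w)) for w in expired])
        # Rule 1: the next batch sees the advanced watermark.
        if batch:
            watermark = max(watermark, max(batch) - DELAY)
    return output


batches = [
    [10, 11, 12, 13, 14, 15],  # fills windows starting at 10 and 15
    [25],                      # moves the watermark to 15 for the next batch
    [12, 25],                  # 12 belongs to the window that expires now
]
print(run_batches(batches))
```

Under these assumed rules the row with event time 12 arrives in the
very batch in which the state for its window expires. It is dropped,
and the window starting at 10 is emitted with the count it had before
that batch. Without rule 2 the same row would be counted. Which of
the two outcomes is the intended one is exactly the open question.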

Example's coming up.

Regards,
Jacek Laskowski


On Sun, Sep 3, 2017 at 11:04 PM, Jacek Laskowski <[hidden email]> wrote:

> [...]
