Does StreamingSymmetricHashJoinExec work with watermark? I don't think so

Jacek Laskowski
Hi,

I think watermarks do not work with StreamingSymmetricHashJoinExec, for the following reasons:

1. leftKeys and rightKeys have no spark.watermarkDelayMs metadata entry at planning time [1]
2. Since the left and right keys have no watermark delay at planning time, the code [2] won't find one at execution time

Is my understanding correct? If not, can you point me to examples with a watermark on 1) join keys and 2) values? Thanks a lot.
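
To make the two points above concrete, here is a toy model of the lookup I have in mind. It is plain Python, not Spark code: Attribute, with_watermark and find_watermarked_key are hypothetical names made up for this sketch, and only the metadata key spark.watermarkDelayMs comes from the description above. It assumes the delay is stored as a metadata entry on the watermarked column and that the join looks for that entry on its keys.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Metadata key named in this post; everything else here is a hypothetical toy model.
WATERMARK_DELAY_KEY = "spark.watermarkDelayMs"


@dataclass
class Attribute:
    """Hypothetical stand-in for a column reference that carries metadata."""
    name: str
    metadata: Dict[str, int] = field(default_factory=dict)


def with_watermark(attr: Attribute, delay_ms: int) -> Attribute:
    """Return a copy of attr tagged with a watermark delay in milliseconds."""
    return Attribute(attr.name, {**attr.metadata, WATERMARK_DELAY_KEY: delay_ms})


def find_watermarked_key(keys: List[Attribute]) -> Optional[int]:
    """Return the index of the first join key carrying the delay entry, else None."""
    for index, key in enumerate(keys):
        if WATERMARK_DELAY_KEY in key.metadata:
            return index
    return None


# Case 1: the watermark sits on a value column, and the join key is a plain id.
event_time = with_watermark(Attribute("event_time"), 10_000)
plain_keys = [Attribute("id")]

# Case 2: the watermarked column itself is one of the join keys.
time_keys = [Attribute("id"), event_time]

print(find_watermarked_key(plain_keys))
print(find_watermarked_key(time_keys))
```

In this model the lookup over plain_keys finds nothing, which is the situation described in point 1, while time_keys yields a match. The sketch says nothing about whether a watermark on a non-key column is picked up by some other path.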



Regards,
Jacek Laskowski
----
The Internals of Spark SQL https://bit.ly/spark-sql-internals
The Internals of Spark Structured Streaming https://bit.ly/spark-structured-streaming
The Internals of Apache Kafka https://bit.ly/apache-kafka-internals
Re: Does StreamingSymmetricHashJoinExec work with watermark? I don't think so

Jungtaek Lim-2
Jacek,

Would you mind sharing a query that reproduces this? I'm not sure I follow without an example of what is "not working".

Thanks,
Jungtaek Lim (HeartSaVioR)

(In reply to the message from Jacek Laskowski <[hidden email]> of Tue, Nov 12, 2019 at 12:04 AM.)