Stream Stream joins with update and complete mode

classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view

Stream Stream joins with update and complete mode

As per the documentation
, only append mode is supported

*As of Spark 2.3, you can use joins only when the query is in Append output
mode. Other output modes are not yet supported.*

But as per the code there is no check done for the output mode

// output mode checked is missed (UnsupportedOperationChecker.scala)
case LeftOuter =>
              if (!left.isStreaming && right.isStreaming) {
                throwError("Left outer join with a streaming
DataFrame/Dataset " +
                  "on the right and a static DataFrame/Dataset on the left
is not supported")
              } else if (left.isStreaming && right.isStreaming) {
                val watermarkInJoinKeys =

                val hasValidWatermarkRange =
                    left.outputSet, right.outputSet, condition,

                if (!watermarkInJoinKeys && !hasValidWatermarkRange) {
                  throwError("Stream-stream outer join between two streaming
DataFrame/Datasets " +
                    "is not supported without a watermark in the join keys,
or a watermark on " +
                    "the nullable side and an appropriate range condition")

If the documentation is correct, then I can raise the PR to fix the code

Sent from:

To unsubscribe e-mail: [hidden email]