[SS] KafkaSource doesn't use KafkaSourceInitialOffsetWriter for initial offsets?

Jacek Laskowski
Hi,

Just found out that KafkaSource [1] does not use KafkaSourceInitialOffsetWriter (of KafkaMicroBatchStream) [2] for initial offsets.

Any reason for that? Should I report an issue? I'm asking because I work exclusively with 2.4.3 and have no idea what's coming in 3.0.



Regards,
Jacek Laskowski
----
The Internals of Spark SQL https://bit.ly/spark-sql-internals
The Internals of Spark Structured Streaming https://bit.ly/spark-structured-streaming
The Internals of Apache Kafka https://bit.ly/apache-kafka-internals

Re: [SS] KafkaSource doesn't use KafkaSourceInitialOffsetWriter for initial offsets?

Jungtaek Lim
Nice find! I don't see any reason not to use KafkaSourceInitialOffsetWriter from KafkaSource, as the two implementations are identical. I guess the code was copied and pasted at some point and never cleaned up.
Since you haven't submitted a patch, I'll submit one shortly and credit you. If you plan to submit one yourself, I'll close mine and wait for yours. Please let me know.

Thanks,
Jungtaek Lim (HeartSaVioR)


On Mon, Aug 26, 2019 at 8:03 PM Jacek Laskowski <[hidden email]> wrote:
Hi,

Just found out that KafkaSource [1] does not use KafkaSourceInitialOffsetWriter (of KafkaMicroBatchStream) [2] for initial offsets.

Any reason for that? Should I report an issue? I'm asking because I work exclusively with 2.4.3 and have no idea what's coming in 3.0.




Re: [SS] KafkaSource doesn't use KafkaSourceInitialOffsetWriter for initial offsets?

Gabor Somogyi
I just checked this, and it is indeed a copy-paste :) It works properly when KafkaSourceInitialOffsetWriter is used. Pull me in if a review is needed.

BR,
G


On Mon, Aug 26, 2019 at 3:57 PM Jungtaek Lim <[hidden email]> wrote:
Nice find! I don't see any reason not to use KafkaSourceInitialOffsetWriter from KafkaSource, as the two implementations are identical. I guess the code was copied and pasted at some point and never cleaned up.
Since you haven't submitted a patch, I'll submit one shortly and credit you. If you plan to submit one yourself, I'll close mine and wait for yours. Please let me know.

Thanks,
Jungtaek Lim (HeartSaVioR)



Re: [SS] KafkaSource doesn't use KafkaSourceInitialOffsetWriter for initial offsets?

Jungtaek Lim
Thanks! The patch is here: https://github.com/apache/spark/pull/25583

On Mon, Aug 26, 2019 at 11:02 PM Gabor Somogyi <[hidden email]> wrote:
I just checked this, and it is indeed a copy-paste :) It works properly when KafkaSourceInitialOffsetWriter is used. Pull me in if a review is needed.

BR,
G



Re: [SS] KafkaSource doesn't use KafkaSourceInitialOffsetWriter for initial offsets?

Shixiong(Ryan) Zhu
We were worried about regressions when adding Kafka source v2 because it involved a lot of changes. Hence we copy-pasted the code to keep Kafka source v1 untouched and provided a config to fall back to v1.

On Mon, Aug 26, 2019 at 7:05 AM Jungtaek Lim <[hidden email]> wrote:
Thanks! The patch is here: https://github.com/apache/spark/pull/25583

--

Best Regards,

Ryan

Re: [SS] KafkaSource doesn't use KafkaSourceInitialOffsetWriter for initial offsets?

Gabor Somogyi-2
In reply to this post by Jungtaek Lim
OK, starting with this tomorrow...


Re: [SS] KafkaSource doesn't use KafkaSourceInitialOffsetWriter for initial offsets?

Jacek Laskowski
In reply to this post by Jungtaek Lim
Hi Devs,

Thanks all for a very prompt response! That was insanely quick. Merci beaucoup! :)

Regards,
Jacek Laskowski