Ask for reviewing on Structured Streaming PRs

Jungtaek Lim
Hi devs,

May I kindly ask for reviews of PRs for Structured Streaming? I have 5 open pull requests on the SS side [1] (the earliest was opened around 4 months ago), and there appear to be a couple of PRs from others [2] that look ready for review, too.

Thanks in advance,
Jungtaek Lim (HeartSaVioR)

1. https://github.com/apache/spark/pulls?utf8=%E2%9C%93&q=is%3Aopen+is%3Apr+author%3AHeartSaVioR+%5BSS%5D
2. https://github.com/apache/spark/pulls?utf8=%E2%9C%93&q=is%3Aopen+is%3Apr+%5BSS%5D+



Re: Ask for reviewing on Structured Streaming PRs

Dongjin Lee
If possible, could you also review my PR on Kafka's header functionality[^1]? It was added in Kafka 0.11.0.0 but is still not supported in Spark.

Thanks,
Dongjin

[^1]: https://github.com/apache/spark/pull/22282
[^2]: https://issues.apache.org/jira/browse/KAFKA-4208


--
Dongjin Lee

A hitchhiker in the mathematical world.


Re: Ask for reviewing on Structured Streaming PRs

vaclavkosar

I am also waiting for my PR [3] to be finalized. It seems that SS PRs are not being reviewed much these days.

[3] https://github.com/apache/spark/pull/21919



Re: Ask for reviewing on Structured Streaming PRs

Jungtaek Lim
Spark devs, happy new year!

I would like to kindly send a reminder, since there has actually been no review since the thread was started.

Thanks,
Jungtaek Lim (HeartSaVioR)


Re: Ask for reviewing on Structured Streaming PRs

Jungtaek Lim
I'm sorry, but let me send another reminder: non-SS PRs are being reviewed as usual, whereas many SS PRs (regardless of who created them) are still not being reviewed and merged in time.


Re: Ask for reviewing on Structured Streaming PRs

Sean Owen-2
Jungtaek, the best strategy is to find who wrote the code you are
modifying (use GitHub history or git blame) and ping them directly on
the PR. I don't know this code well myself.
It also helps if you can address why the functionality is important,
and describe compatibility implications.

Most PRs are not merged, note. Not commenting on this particular one,
but it's not a 'bug' if it's not being merged.

On Sun, Jan 13, 2019 at 12:29 AM Jungtaek Lim <[hidden email]> wrote:

>
> I'm sorry, but let me send another reminder: non-SS PRs are being reviewed as usual, whereas many SS PRs (regardless of who created them) are still not being reviewed and merged in time.
>
> On Thu, Jan 3, 2019 at 7:57 AM Jungtaek Lim <[hidden email]> wrote:
>>
>> Spark devs, happy new year!
>>
>> I would like to kindly send a reminder, since there has actually been no review since the thread was started.
>>
>> Thanks,
>> Jungtaek Lim (HeartSaVioR)
>>
>> On Wed, Dec 12, 2018 at 11:12 PM Vaclav Kosar <[hidden email]> wrote:
>>>
>>> I am also waiting for my PR [3] to be finalized. It seems that SS PRs are not being reviewed much these days.
>>>
>>> [3] https://github.com/apache/spark/pull/21919
>>>
>>>
>>> On 12. 12. 18 14:37, Dongjin Lee wrote:
>>>
>>> If possible, could you also review my PR on Kafka's header functionality[^1]? It was added in Kafka 0.11.0.0 but is still not supported in Spark.
>>>
>>> Thanks,
>>> Dongjin
>>>
>>> [^1]: https://github.com/apache/spark/pull/22282
>>> [^2]: https://issues.apache.org/jira/browse/KAFKA-4208
>>>
>>> On Wed, Dec 12, 2018 at 6:43 PM Jungtaek Lim <[hidden email]> wrote:
>>>>
>>>> Hi devs,
>>>>
>>>> May I kindly ask for reviews of PRs for Structured Streaming? I have 5 open pull requests on the SS side [1] (the earliest was opened around 4 months ago), and there appear to be a couple of PRs from others [2] that look ready for review, too.
>>>>
>>>> Thanks in advance,
>>>> Jungtaek Lim (HeartSaVioR)
>>>>
>>>> 1. https://github.com/apache/spark/pulls?utf8=%E2%9C%93&q=is%3Aopen+is%3Apr+author%3AHeartSaVioR+%5BSS%5D
>>>> 2. https://github.com/apache/spark/pulls?utf8=%E2%9C%93&q=is%3Aopen+is%3Apr+%5BSS%5D+
>>>>
>>>
>>>
>>> --
>>> Dongjin Lee
>>>
>>> A hitchhiker in the mathematical world.
>>>
>>> github: github.com/dongjinleekr
>>> linkedin: kr.linkedin.com/in/dongjinleekr
>>> speakerdeck: speakerdeck.com/dongjin

---------------------------------------------------------------------
To unsubscribe e-mail: [hidden email]


Re: Ask for reviewing on Structured Streaming PRs

Hyukjin Kwon
But it's true that, IMHO, there's less activity in SS in general. That should be noted. Maybe it's also because committers are busy with other things.

Yeah, I agree that one actionable strategy for now might be to make the PR description as clear as possible to make the review easier, and then ping them in the PRs.



Re: Ask for reviewing on Structured Streaming PRs

Jungtaek Lim
In reply to this post by Sean Owen-2
Sean, this is actually a fallback after pinging committers. I know who can review and merge in the SS area, and I pinged them, but it didn't work. There's even a PR whose approach was encouraged by a committer and whose first phase was reviewed, and it has had no review since.

That's not the first time I have faced this situation, and I used the fallback approach at that time too. (You can see there was no response even in the mail thread.) I'm not sure which approach worked.
https://lists.apache.org/thread.html/c61f32249949b1ff1b265c1a7148c2ea7eda08891e3016fb24008561@%3Cdev.spark.apache.org%3E

I've observed that only (critical) bugfixes are being reviewed and merged in time in the SS area. Other things, like new features and improvements, have drawn much less attention from committers in both discussions and PRs, even though there was participation and approval from the non-committer community. I don't think SS is something that should be put into maintenance mode.

I guess PMC members should try to resolve this situation, as it will (slowly and quietly) cause issues like contributors leaving, the module no longer growing, etc. The problem will grow like a snowball, getting bigger and bigger. I wouldn't mind if neither contributors nor committers were interested in the module, but that is not the case for SS. Maybe either other committers who aren't familiar with the area should try to get familiar with it and cover it, or the area needs more committers.

-Jungtaek Lim (HeartSaVioR)


Re: Ask for reviewing on Structured Streaming PRs

Sean Owen-2
Yes you're preaching to the choir here. SS does seem somewhat
abandoned by those that have worked on it. I have also been at times
frustrated that some areas fall into this pattern.

There isn't a way to make people work on it, and I personally am not
interested in it nor have a background in SS.

I did leave some comments on your PR and will see if we can get
comfortable with merging it, as I presume you are pretty knowledgeable
about the change.

On Sun, Jan 13, 2019 at 4:55 PM Jungtaek Lim <[hidden email]> wrote:

>
> Sean, this is actually a fallback after pinging committers. I know who can review and merge in the SS area, and I pinged them, but it didn't work. There's even a PR whose approach was encouraged by a committer and whose first phase was reviewed, and it has had no review since.
>
> That's not the first time I have faced this situation, and I used the fallback approach at that time too. (You can see there was no response even in the mail thread.) I'm not sure which approach worked.
> https://lists.apache.org/thread.html/c61f32249949b1ff1b265c1a7148c2ea7eda08891e3016fb24008561@%3Cdev.spark.apache.org%3E
>
> I've observed that only (critical) bugfixes are being reviewed and merged in time in the SS area. Other things, like new features and improvements, have drawn much less attention from committers in both discussions and PRs, even though there was participation and approval from the non-committer community. I don't think SS is something that should be put into maintenance mode.
>
> I guess PMC members should try to resolve this situation, as it will (slowly and quietly) cause issues like contributors leaving, the module no longer growing, etc. The problem will grow like a snowball, getting bigger and bigger. I wouldn't mind if neither contributors nor committers were interested in the module, but that is not the case for SS. Maybe either other committers who aren't familiar with the area should try to get familiar with it and cover it, or the area needs more committers.
>
> -Jungtaek Lim (HeartSaVioR)
>

Re: Ask for reviewing on Structured Streaming PRs

Cody Koeninger-2
I feel like I've already said my piece on
https://github.com/apache/spark/pull/22138; let me know if you have
more questions.

As for SS in general, I don't have a production SS deployment, so I'm
less comfortable with reviewing large changes to it.  But if no other
committers are working on it...


Re: Ask for reviewing on Structured Streaming PRs

Jungtaek Lim
In reply to this post by Sean Owen-2
Sad to hear that. While I understand such a thing can happen in any project, it feels like a bad sign to me that a non-experimental major feature with no alternative is losing interest.

I also fully agree that there isn't a way to make people work on it (I have also encountered similar situations in most of the projects in which I have been involved as a committer or PMC member), but things might get better depending on how we deal with the situation, given that there are some people (not only me) who would like to work on SS and are feeling stuck.

I really appreciate your help in trying to review PRs in an area you're not comfortable with. I understand that's not easy. Thanks for doing that!


Re: Ask for reviewing on Structured Streaming PRs

Jungtaek Lim
In reply to this post by Cody Koeninger-2
Cody, I guess I already addressed your comments in the PR (#22138). The approach was changed to address your concern, and after that Gabor helped review the PR. Please take another look when you have time.

2019년 1월 15일 (화) 오전 1:01, Cody Koeninger <[hidden email]>님이 작성:
I feel like I've already said my piece on
https://github.com/apache/spark/pull/22138 let me know if you have
more questions.

As for SS in general, I don't have a production SS deployment, so I'm
less comfortable with reviewing large changes to it.  But if no other
committers are working on it...

On Sun, Jan 13, 2019 at 5:19 PM Sean Owen <[hidden email]> wrote:
>
> Yes you're preaching to the choir here. SS does seem somewhat
> abandoned by those that have worked on it. I have also been at times
> frustrated that some areas fall into this pattern.
>
> There isn't a way to make people work on it, and I personally am not
> interested in it nor have a background in SS.
>
> I did leave some comments on your PR and will see if we can get
> comfortable with merging it, as I presume you are pretty knowledgeable
> about the change.
>
> On Sun, Jan 13, 2019 at 4:55 PM Jungtaek Lim <[hidden email]> wrote:
> >
> > Sean, this is actually a fail-back on pinging committers. I know who can review and merge in SS area, and pinged to them, didn't work. Even there's a PR which approach was encouraged by committer and reviewed the first phase, and no review.
> >
> > That's not the first time I have faced the situation, and I used the fail-back approach at that time. (You can see there was no response even in the mail thread.) Not sure which approach worked.
> > https://lists.apache.org/thread.html/c61f32249949b1ff1b265c1a7148c2ea7eda08891e3016fb24008561@%3Cdev.spark.apache.org%3E
> >
> > I've observed that only (critical) bugfixes are being reviewed and merged in time for SS area. For other stuffs like new features and improvements, both discussions and PRs were pretty less popular from committers: though there was even participation/approve from non-committer community. I don't think SS is the thing to be turned into maintenance.
> >
> > I guess PMC members should try to resolve such situation, as it will (slowly and quietly) make some issues like contributors leaving, module stopped growing up, etc.. The problem will grow up like a snowball: getting bigger and bigger. I don't mind if there's no interest on both contributors and committers for such module, but SS is not. Maybe either other committers who weren't familiar with should try to get familiar and cover the area, or the area needs more committers.
> >
> > -Jungtaek Lim (HeartSaVioR)
> >
> > On Sun, Jan 13, 2019 at 11:37 PM, Sean Owen <[hidden email]> wrote:
> >>
> >> Jungtaek, the best strategy is to find who wrote the code you are
> >> modifying (use GitHub history or git blame) and ping them directly on
> >> the PR. I don't know this code well myself.
> >> It also helps if you can address why the functionality is important,
> >> and describe compatibility implications.
> >>
> >> Most PRs are not merged, note. Not commenting on this particular one,
> >> but it's not a 'bug' if it's not being merged.
> >>
> >> On Sun, Jan 13, 2019 at 12:29 AM Jungtaek Lim <[hidden email]> wrote:
> >> >
> >> > I'm sorry, but let me bring this up again: non-SS PRs are being reviewed accordingly, whereas many SS PRs (regardless of who created them) are still not being reviewed and merged in time.
> >> >
> >> > On Thu, Jan 3, 2019 at 7:57 AM, Jungtaek Lim <[hidden email]> wrote:
> >> >>
> >> >> Spark devs, happy new year!
> >> >>
> >> >> I would like to send a kind reminder, since there has actually been no review since I started the thread.
> >> >>
> >> >> Thanks,
> >> >> Jungtaek Lim (HeartSaVioR)
> >> >>
> >> >> On Wed, Dec 12, 2018 at 11:12 PM, Vaclav Kosar <[hidden email]> wrote:
> >> >>>
> >> >>> I am also waiting for any finalization of my PR [3]. It seems that SS PRs are not being reviewed much these days.
> >> >>>
> >> >>> [3] https://github.com/apache/spark/pull/21919
> >> >>>
> >> >>>
> >> >>> On 12. 12. 18 14:37, Dongjin Lee wrote:
> >> >>>
> >> >>> If it is possible, could you review my PR on Kafka's header functionality[^1] also? It was added in Kafka 0.11.0.0 but still not supported in Spark.
> >> >>>
> >> >>> Thanks,
> >> >>> Dongjin
> >> >>>
> >> >>> [^1]: https://github.com/apache/spark/pull/22282
> >> >>> [^2]: https://issues.apache.org/jira/browse/KAFKA-4208
> >> >>>
> >> >>> On Wed, Dec 12, 2018 at 6:43 PM Jungtaek Lim <[hidden email]> wrote:
> >> >>>>
> >> >>>> Hi devs,
> >> >>>>
> >> >>>> May I kindly ask for reviews of PRs for Structured Streaming? I have 5 open pull requests on the SS side [1] (the earliest was opened around 4 months ago), and there look to be a couple of PRs from others [2] that look ready for review, too.
> >> >>>>
> >> >>>> Thanks in advance,
> >> >>>> Jungtaek Lim (HeartSaVioR)
> >> >>>>
> >> >>>> 1. https://github.com/apache/spark/pulls?utf8=%E2%9C%93&q=is%3Aopen+is%3Apr+author%3AHeartSaVioR+%5BSS%5D
> >> >>>> 2. https://github.com/apache/spark/pulls?utf8=%E2%9C%93&q=is%3Aopen+is%3Apr+%5BSS%5D+
> >> >>>>
> >> >>>
> >> >>>
> >> >>> --
> >> >>> Dongjin Lee
> >> >>>
> >> >>> A hitchhiker in the mathematical world.
> >> >>>
> >> >>> github: github.com/dongjinleekr
> >> >>> linkedin: kr.linkedin.com/in/dongjinleekr
> >> >>> speakerdeck: speakerdeck.com/dongjin

Re: Ask for reviewing on Structured Streaming PRs

Nicholas Chammas
As an observer, this thread is interesting and concerning. Is there an emerging consensus that Structured Streaming is somehow not relevant anymore? Or is it just that folks consider it "complete enough"?

Structured Streaming was billed as the replacement for DStreams. If committers, generally speaking, have lost interest in Structured Streaming, does that mean the Apache Spark project is somehow no longer aiming to provide a "first-class" solution to the problem of stream processing?

On Mon, Jan 14, 2019 at 3:43 PM Jungtaek Lim <[hidden email]> wrote:
Cody, I believe I already addressed your comments in the PR (#22138). The approach was changed to address your concern, and after that Gabor helped review the PR. Please take another look when you have time to get into it.

On Tue, Jan 15, 2019 at 1:01 AM, Cody Koeninger <[hidden email]> wrote:
I feel like I've already said my piece on
https://github.com/apache/spark/pull/22138 let me know if you have
more questions.

As for SS in general, I don't have a production SS deployment, so I'm
less comfortable with reviewing large changes to it.  But if no other
committers are working on it...


Re: Ask for reviewing on Structured Streaming PRs

rxin
There are a few things to keep in mind:

1. Structured Streaming isn't an independent project. It actually (by design) depends on all the rest of Spark SQL, and virtually all improvements to Spark SQL benefit Structured Streaming.

2. The project, as far as I can tell, is relatively mature for core ETL and incremental processing purposes. I interact with a lot of users who use it every day. We can always expand the use cases and add more, but that also adds maintenance burden. In any case, it'd be good to get some activity here.




On Mon, Jan 14, 2019 at 5:11 PM, Nicholas Chammas <[hidden email]> wrote:
As an observer, this thread is interesting and concerning. Is there an emerging consensus that Structured Streaming is somehow not relevant anymore? Or is it just that folks consider it "complete enough"?

Structured Streaming was billed as the replacement for DStreams. If committers, generally speaking, have lost interest in Structured Streaming, does that mean the Apache Spark project is somehow no longer aiming to provide a "first-class" solution to the problem of stream processing?


Re: Ask for reviewing on Structured Streaming PRs

rxin
BTW, the largest change to SS right now is probably the entire data source API v2 effort, which aims to unify streaming and batch from a data source perspective and to provide a reliable, expressive source/sink API.


On Mon, Jan 14, 2019 at 5:34 PM, Reynold Xin <[hidden email]> wrote:
There are a few things to keep in mind:

1. Structured Streaming isn't an independent project. It actually (by design) depends on all the rest of Spark SQL, and virtually all improvements to Spark SQL benefit Structured Streaming.

2. The project, as far as I can tell, is relatively mature for core ETL and incremental processing purposes. I interact with a lot of users who use it every day. We can always expand the use cases and add more, but that also adds maintenance burden. In any case, it'd be good to get some activity here.





Re: Ask for reviewing on Structured Streaming PRs

Nicholas Chammas
OK, good to know, and that all makes sense. Thanks for clearing up my concern.

One of the great things about Spark is, as you pointed out, that improvements to core components benefit multiple features at once.

On Mon, Jan 14, 2019 at 8:36 PM Reynold Xin <[hidden email]> wrote:
BTW, the largest change to SS right now is probably the entire data source API v2 effort, which aims to unify streaming and batch from a data source perspective and to provide a reliable, expressive source/sink API.



Re: Ask for reviewing on Structured Streaming PRs

JackyLee
Agree with rxin. Maybe we should consider these PRs, especially the
large ones, after the DataSource V2 API is ready.




Re: Ask for reviewing on Structured Streaming PRs

Jungtaek Lim
Yes, I understand what Reynold stated (as Michael Armbrust stated earlier), and I agree it's a major advantage that improvements to CORE/SQL also benefit SS.

I'm just concerned that both SQL and SS are being impacted by DSv2, but things are going differently for the two. SQL still has active contributions that are unrelated to DSv2, whereas SS doesn't seem to. I wish we had a small time slot to keep SS active (I don't expect as much as SQL gets, but at least reviews in time, before the authors of the PRs leave).

On Tue, Jan 15, 2019 at 11:00 AM, JackyLee <[hidden email]> wrote:
Agree with rxin. Maybe we should consider these PRs, especially the
large ones, after the DataSource V2 API is ready.


