Auto-closing PRs or How to get reviewers' attention

classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

Auto-closing PRs or How to get reviewers' attention

Enrico Minack

Hi Spark Developers,

I have a fundamental question on the process of contributing to Apache Spark from outside the circle of committers.

I have gone through a number of pull requests and I always found it hard to get feedback, especially from committers. I understand there is a very high competition for getting attention of those few committers. Given Spark's code base is so huge, only very few people will feel comfortable approving code changes for a specific code section. Still, the motivation of those that want to contribute suffers from this.

In particular I am getting annoyed by the auto-closing PR feature on GitHub. I understand the usefulness of this feature for such a frequent project, but I personally am impacted by the weaknesses of this approach. I hope, this can be improved.

The feature first warns in advance that it is "... closing this PR because it hasn't been updated in a while". This comment looks a bit silly in situations where the contributor is waiting for committers' feedback.

What is the approved way to ...

... prevent it from being auto-closed? Committing and commenting to the PR does not prevent it from being closed the next day.

... re-open it? The comment says "If you'd like to revive this PR, please reopen it ...", but there is no re-open button anywhere on the PR!

... remove the Stale tag? The comment says "...  ask a committer to remove the Stale tag!". Where can I find a list of committers and their contact details? What is the best way to contact them? E-Mail? Mentioning them in a PR comment?

... find the right committer to review a PR? The contributors page states "ping likely reviewers", but it does not state how to identify likely reviewers. Do you recommend git-blaming the relevant code section? What if those committers are not available any more? Whom to ask next?

... contact committers to get their attention? Cc'ing them in PR comments? Sending E-Mails? Doesn't that contribute to their cognitive load?

What is the expected contributor's response to a PR that does not get feedback? Giving up?

Are there processes in place to increase the probability PRs do not get forgotten, auto-closed and lost?


This is not about my specific pull requests or reviewers of those. I appreciate their time and engagement.
This is about the general process of getting feedback and needed improvements for it in order to increase contributor community happiness.

Cheers,
Enrico

Reply | Threaded
Open this post in threaded view
|

Re: Auto-closing PRs or How to get reviewers' attention

Holden Karau
Git blame is a good way to figure out likely potential reviewers (eg who’s been working in the area). Another is who filed the JIRA if it’s not you.

On Thu, Feb 18, 2021 at 6:58 AM Enrico Minack <[hidden email]> wrote:

Hi Spark Developers,

I have a fundamental question on the process of contributing to Apache Spark from outside the circle of committers.

I have gone through a number of pull requests and I always found it hard to get feedback, especially from committers. I understand there is a very high competition for getting attention of those few committers. Given Spark's code base is so huge, only very few people will feel comfortable approving code changes for a specific code section. Still, the motivation of those that want to contribute suffers from this.

In particular I am getting annoyed by the auto-closing PR feature on GitHub. I understand the usefulness of this feature for such a frequent project, but I personally am impacted by the weaknesses of this approach. I hope, this can be improved.

The feature first warns in advance that it is "... closing this PR because it hasn't been updated in a while". This comment looks a bit silly in situations where the contributor is waiting for committers' feedback.

What is the approved way to ...

... prevent it from being auto-closed? Committing and commenting to the PR does not prevent it from being closed the next day.

... re-open it? The comment says "If you'd like to revive this PR, please reopen it ...", but there is no re-open button anywhere on the PR!

... remove the Stale tag? The comment says "...  ask a committer to remove the Stale tag!". Where can I find a list of committers and their contact details? What is the best way to contact them? E-Mail? Mentioning them in a PR comment?

... find the right committer to review a PR? The contributors page states "ping likely reviewers", but it does not state how to identify likely reviewers. Do you recommend git-blaming the relevant code section? What if those committers are not available any more? Whom to ask next?

... contact committers to get their attention? Cc'ing them in PR comments? Sending E-Mails? Doesn't that contribute to their cognitive load?

What is the expected contributor's response to a PR that does not get feedback? Giving up?

Are there processes in place to increase the probability PRs do not get forgotten, auto-closed and lost?


This is not about my specific pull requests or reviewers of those. I appreciate their time and engagement.
This is about the general process of getting feedback and needed improvements for it in order to increase contributor community happiness.

Cheers,
Enrico

--
Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9 
Reply | Threaded
Open this post in threaded view
|

Re: Auto-closing PRs or How to get reviewers' attention

Sean Owen-2
In reply to this post by Enrico Minack
Holden is absolutely correct - pinging relevant individuals is probably your best bet. I skim the 40-50 PRs that have activity each day and look into a few that look like I would know something about by the title, but, easy to miss something I could weigh in on.

There is no way to force people to review or commit something of course. And keep in mind we get a lot of, shall we say, unuseful pull requests. There is occasionally some blowback to closing someone's PR, so the path of least resistance is often the timeout / 'soft close'. That is, it takes a lot more time to satisfactorily debate down the majority of PRs that probably shouldn't get merged, and there just isn't that much bandwidth. That said of course it's bad if lots of good PRs are getting lost in the shuffle and I am sure there are some.

One other aspect is that a committer is taking some degree of responsibility for merging a change, so the ask is more than just a few minutes of eyeballing. If it breaks something the merger pretty much owns resolving it, and, the whole project owns any consequence of the change for the future.

I think it might just be committers that can reopen at this point, not sure if that changed. But you'll probably need someone's attention anyway to make progress.

Without knowing specific PRs, I can't say whether there was a good reason, bad reason, or no particular reason it wasn't engaged. I think it's OK to float a PR or two you really believe should have gotten attention to dev@, but yeah in the end you need to find the person who has most touched that code really.

The general advice from https://spark.apache.org/contributing.html is still valuable. Clear fixes are easier to say 'yes' to than big refactorings. Improving docs, tests, existing features is better than adding big new things, etc.


On Thu, Feb 18, 2021 at 8:58 AM Enrico Minack <[hidden email]> wrote:

Hi Spark Developers,

I have a fundamental question on the process of contributing to Apache Spark from outside the circle of committers.

I have gone through a number of pull requests and I always found it hard to get feedback, especially from committers. I understand there is a very high competition for getting attention of those few committers. Given Spark's code base is so huge, only very few people will feel comfortable approving code changes for a specific code section. Still, the motivation of those that want to contribute suffers from this.

In particular I am getting annoyed by the auto-closing PR feature on GitHub. I understand the usefulness of this feature for such a frequent project, but I personally am impacted by the weaknesses of this approach. I hope, this can be improved.

The feature first warns in advance that it is "... closing this PR because it hasn't been updated in a while". This comment looks a bit silly in situations where the contributor is waiting for committers' feedback.

What is the approved way to ...

... prevent it from being auto-closed? Committing and commenting to the PR does not prevent it from being closed the next day.

... re-open it? The comment says "If you'd like to revive this PR, please reopen it ...", but there is no re-open button anywhere on the PR!

... remove the Stale tag? The comment says "...  ask a committer to remove the Stale tag!". Where can I find a list of committers and their contact details? What is the best way to contact them? E-Mail? Mentioning them in a PR comment?

... find the right committer to review a PR? The contributors page states "ping likely reviewers", but it does not state how to identify likely reviewers. Do you recommend git-blaming the relevant code section? What if those committers are not available any more? Whom to ask next?

... contact committers to get their attention? Cc'ing them in PR comments? Sending E-Mails? Doesn't that contribute to their cognitive load?

What is the expected contributor's response to a PR that does not get feedback? Giving up?

Are there processes in place to increase the probability PRs do not get forgotten, auto-closed and lost?


This is not about my specific pull requests or reviewers of those. I appreciate their time and engagement.
This is about the general process of getting feedback and needed improvements for it in order to increase contributor community happiness.

Cheers,
Enrico

Reply | Threaded
Open this post in threaded view
|

Re: Auto-closing PRs or How to get reviewers' attention

Nicholas Chammas
In reply to this post by Enrico Minack
On Thu, Feb 18, 2021 at 9:58 AM Enrico Minack <[hidden email]> wrote:

What is the approved way to ...

... prevent it from being auto-closed? Committing and commenting to the PR does not prevent it from being closed the next day.

Committing and commenting should prevent the PR from being closed. It may be that commenting after the stale message has been posted does not work (which would likely be a bug in the action or in our config), but there are PRs that have been open for months with consistent activity that do not get closed.

So at the very least, proactively committing or commenting every month will keep the PR open. However, that's not the real problem, right? The real problem is getting committer attention.

... re-open it? The comment says "If you'd like to revive this PR, please reopen it ...", but there is no re-open button anywhere on the PR!

I don't know if there is a repo setting here that allows non-committers to reopen their own closed PRs. At the very worst, you can always open a new PR from the same branch, though we should update the stale message text if contributors cannot in fact reopen their own PRs.

What is the expected contributor's response to a PR that does not get feedback? Giving up?

I've baby-sat several PRs that took months to get in. Here's an example off the top of my head (5-6 months to be merged in). I'm sure everyone on here, including most committers themselves, have had this experience. It's common. The expected response is to be persistent, to try to find a committer or shepherd for your PR, and to proactively make your PR easier to review.

Are there processes in place to increase the probability PRs do not get forgotten, auto-closed and lost?

There are things you can do as a contributor to increase the likelihood your PR will get reviewed, but I wouldn't call them "processes". This is an open source project built on corporate sponsorship for some stuff and volunteer energy for everything else. There is no guarantee or formal obligation for anyone to do the work of reviewing PRs. That's just the nature of open source work.

The things that you can do are:
  • Make your PR as small and focused as possible.
  • Make sure the build is passing and that you've followed the contributing guidelines.
  • Find the people who most recently worked on the area you're touching and ask them for help.
  • Address reviewers' requests and concerns.
  • Try to get some committer buy-in for your idea before spending time developing it.
  • Ask for input on the dev list for your PR.
Basically, most of the advice boils down to "make it easy for reviewers''. Even then, though, sometimes things won't work out (5-6 months and closed without merging). It's just the nature of contributing to a large project like Spark where there is a lot going on.
Reply | Threaded
Open this post in threaded view
|

Re: Auto-closing PRs or How to get reviewers' attention

Nicholas Chammas
In reply to this post by Sean Owen-2
On Thu, Feb 18, 2021 at 10:34 AM Sean Owen <[hidden email]> wrote:
There is no way to force people to review or commit something of course. And keep in mind we get a lot of, shall we say, unuseful pull requests. There is occasionally some blowback to closing someone's PR, so the path of least resistance is often the timeout / 'soft close'. That is, it takes a lot more time to satisfactorily debate down the majority of PRs that probably shouldn't get merged, and there just isn't that much bandwidth. That said of course it's bad if lots of good PRs are getting lost in the shuffle and I am sure there are some.

One other aspect is that a committer is taking some degree of responsibility for merging a change, so the ask is more than just a few minutes of eyeballing. If it breaks something the merger pretty much owns resolving it, and, the whole project owns any consequence of the change for the future.

+1 
Reply | Threaded
Open this post in threaded view
|

Re: Auto-closing PRs or How to get reviewers' attention

rxin
Enrico - do feel free to reopen the PRs or email people directly, unless you are told otherwise.


On Thu, Feb 18, 2021 at 9:09 AM, Nicholas Chammas <[hidden email]> wrote:
On Thu, Feb 18, 2021 at 10:34 AM Sean Owen <[hidden email]> wrote:
There is no way to force people to review or commit something of course. And keep in mind we get a lot of, shall we say, unuseful pull requests. There is occasionally some blowback to closing someone's PR, so the path of least resistance is often the timeout / 'soft close'. That is, it takes a lot more time to satisfactorily debate down the majority of PRs that probably shouldn't get merged, and there just isn't that much bandwidth. That said of course it's bad if lots of good PRs are getting lost in the shuffle and I am sure there are some.

One other aspect is that a committer is taking some degree of responsibility for merging a change, so the ask is more than just a few minutes of eyeballing. If it breaks something the merger pretty much owns resolving it, and, the whole project owns any consequence of the change for the future.

+1 


smime.p7s (6K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Auto-closing PRs or How to get reviewers' attention

Enrico Minack
In reply to this post by Sean Owen-2
Am 18.02.21 um 16:34 schrieb Sean Owen:
> One other aspect is that a committer is taking some degree of
> responsibility for merging a change, so the ask is more than just a
> few minutes of eyeballing. If it breaks something the merger pretty
> much owns resolving it, and, the whole project owns any consequence of
> the change for the future.

I think this explains the hesitation pretty well: Committers take
ownership of the change. It is understandable that PRs then have to be
very convincing with low risk/benefit ratio.

Are there plans or initiatives to proactively widen the base of
committers to mitigate the current situation?

Enrico


---------------------------------------------------------------------
To unsubscribe e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: Auto-closing PRs or How to get reviewers' attention

Sean Owen-2
Yes, committers are added regularly. I don't think that changes the situation for most PRs that perhaps just aren't suitable to merge.
Again the best thing you can do is make it as easy to merge as possible and find people who have touched the code for review. This often works out.

On Tue, Feb 23, 2021 at 4:06 AM Enrico Minack <[hidden email]> wrote:
Am 18.02.21 um 16:34 schrieb Sean Owen:
> One other aspect is that a committer is taking some degree of
> responsibility for merging a change, so the ask is more than just a
> few minutes of eyeballing. If it breaks something the merger pretty
> much owns resolving it, and, the whole project owns any consequence of
> the change for the future.

I think this explains the hesitation pretty well: Committers take
ownership of the change. It is understandable that PRs then have to be
very convincing with low risk/benefit ratio.

Are there plans or initiatives to proactively widen the base of
committers to mitigate the current situation?

Enrico

Reply | Threaded
Open this post in threaded view
|

Re: Auto-closing PRs or How to get reviewers' attention

MrPowers
Enrico - thanks for sharing your experience.

I recently got a couple of PRs merged and my experience was different.  I got lots of feedback from several maintainers (thank you very much!).

Can't speak to your PRs specifically, but can give the general advice that pivoting code based on maintainer feedback is probably the easiest way to get stuff merged.

I initially added an add_hours function to org.apache.spark.sql.functions and it seemed pretty clear that the maintainers weren't the biggest fans and were more in favor of a make_interval function.  I proactively closed my own add_hours PR and pushed forward make_interval instead.

In hindsight, add_hours would have been a bad addition to the API and I'm glad it got rejected.  For big, mature projects like Spark, it's more important for maintainers to reject stuff than add new functionality.  Software bloat is the main risk for Spark.

I'm of the opinion that the auto-closing PR feature is working well.  Spark maintainers have a difficult job of having to say "no" and disappoint people a lot.  Auto closing is a great way to indirectly communicate the "no" in a way that's more psychologically palatable for both the maintainer and the committer.

On Tue, Feb 23, 2021 at 9:13 AM Sean Owen <[hidden email]> wrote:
Yes, committers are added regularly. I don't think that changes the situation for most PRs that perhaps just aren't suitable to merge.
Again the best thing you can do is make it as easy to merge as possible and find people who have touched the code for review. This often works out.

On Tue, Feb 23, 2021 at 4:06 AM Enrico Minack <[hidden email]> wrote:
Am 18.02.21 um 16:34 schrieb Sean Owen:
> One other aspect is that a committer is taking some degree of
> responsibility for merging a change, so the ask is more than just a
> few minutes of eyeballing. If it breaks something the merger pretty
> much owns resolving it, and, the whole project owns any consequence of
> the change for the future.

I think this explains the hesitation pretty well: Committers take
ownership of the change. It is understandable that PRs then have to be
very convincing with low risk/benefit ratio.

Are there plans or initiatives to proactively widen the base of
committers to mitigate the current situation?

Enrico