[VOTE] Release Spark 3.1.0 (RC1)

[VOTE] Release Spark 3.1.0 (RC1)

Hyukjin Kwon
Please vote on releasing the following candidate as Apache Spark version 3.1.0.

The vote is open until January 8th 4PM PST and passes if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes.

[ ] +1 Release this package as Apache Spark 3.1.0
[ ] -1 Do not release this package because ...
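
As a rough illustration, the passing rule stated above can be sketched in a few lines of Python. This is only a sketch under assumptions: the function name and parameters are hypothetical, and "majority" is read here as more binding (PMC) +1 votes than binding -1 votes.

```python
def vote_passes(binding_plus_ones: int,
                binding_minus_ones: int,
                minimum_plus_ones: int = 3) -> bool:
    """Return True if the release vote passes under the rule sketched here.

    Assumption: only binding (PMC) votes are counted, and "majority" means
    strictly more +1 than -1 votes among them.
    """
    has_minimum = binding_plus_ones >= minimum_plus_ones
    has_majority = binding_plus_ones > binding_minus_ones
    return has_minimum and has_majority


# Example tallies (made up for illustration, not results of this vote).
print(vote_passes(3, 0))  # minimum reached, no -1 votes
print(vote_passes(2, 0))  # fewer than three +1 votes
print(vote_passes(3, 4))  # minimum reached, but no majority
```

Non-binding votes are still welcome on the list; they are simply not counted in this sketch.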

To learn more about Apache Spark, please see http://spark.apache.org/

The tag to be voted on is v3.1.0-rc1 (commit 97340c1e34cfd84de445b6b7545cfa466a1baaf6):

The release files, including signatures, digests, etc. can be found at:

Signatures used for Spark RCs can be found in this file:

The staging repository for this release can be found at:

The documentation corresponding to this release can be found at:

The list of bug fixes going into 3.1.0 can be found at the following URL:

This release is using the release script of the tag v3.1.0-rc1.

FAQ


=========================
How can I help test this release?
=========================

If you are a Spark user, you can help us test this release by taking
an existing Spark workload and running on this release candidate, then
reporting any regressions.

If you're working in PySpark you can set up a virtual env and install
the current RC via "pip install https://dist.apache.org/repos/dist/dev/spark/v3.1.0-rc1-bin/pyspark-3.1.0.tar.gz"
and see if anything important breaks.
In Java/Scala, you can add the staging repository to your project's resolvers and test
with the RC (make sure to clean up the artifact cache before/after so
you don't end up building with an out-of-date RC going forward).
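
For the PySpark route, a minimal smoke test might look like the sketch below. It assumes the RC has already been installed into the active virtual env with the pip command above and that a local Java runtime is available; the helper names (`version_matches`, `run_smoke_test`) and the tiny job are hypothetical and only illustrate the idea. A real check should run your own workload.

```python
import importlib.util

EXPECTED_VERSION = "3.1.0"  # version string the RC is expected to report


def version_matches(installed: str, expected: str = EXPECTED_VERSION) -> bool:
    """True if the installed PySpark reports exactly the expected version."""
    return installed.strip() == expected


def run_smoke_test() -> int:
    """Start a local session, check the version, and run one tiny job."""
    if importlib.util.find_spec("pyspark") is None:
        raise RuntimeError("pyspark is not installed in this environment")
    # Imported lazily so the helpers above work without PySpark installed.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .master("local[1]")
             .appName("rc-smoke-test")
             .getOrCreate())
    try:
        if not version_matches(spark.version):
            raise RuntimeError(
                f"expected {EXPECTED_VERSION}, got {spark.version}")
        # Sum the ids 0..99 through the DataFrame API.
        return spark.range(100).groupBy().sum("id").collect()[0][0]
    finally:
        spark.stop()
```

Calling `run_smoke_test()` inside the virtual env should return the sum of 0..99 if the RC is installed and working; anything else (an exception, a version mismatch) is worth reporting on this thread.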

===========================================
What should happen to JIRA tickets still targeting 3.1.0?
===========================================

The current list of open tickets targeted at 3.1.0 can be found at:
https://issues.apache.org/jira/projects/SPARK and search for "Target Version/s" = 3.1.0

Committers should look at those and triage. Extremely important bug
fixes, documentation, and API tweaks that impact compatibility should
be worked on immediately. Everything else please retarget to an
appropriate release.

==================
But my bug isn't fixed?
==================

In order to make timely releases, we will typically not hold the
release unless the bug in question is a regression from the previous
release. That being said, if there is something which is a regression
that has not been correctly targeted please ping me or a committer to
help target the issue.


Re: [VOTE] Release Spark 3.1.0 (RC1)

Jungtaek Lim-2
There's an issue, SPARK-33635 [1], reporting a performance regression on Kafka reads between Spark 2.4 and 3.0, which sounds like a blocker. I'll mark this as a blocker, unless anyone has different opinions.



Re: [VOTE] Release Spark 3.1.0 (RC1)

Hyukjin Kwon
Actually, I will mark https://issues.apache.org/jira/browse/SPARK-34021 as a blocker too. For CRAN submission, we should fix it.


Re: [VOTE] Release Spark 3.1.0 (RC1)

Hyukjin Kwon
Seems like we have two PRs for both blockers, and one is already merged, nice.
I will wait for a couple of days more before starting a new RC to make sure we catch more regressions before the new RC.
Please keep testing this RC. I would appreciate it :-).


Re: [VOTE] Release Spark 3.1.0 (RC1)

Jacek Laskowski
In reply to this post by Hyukjin Kwon


Re: [VOTE] Release Spark 3.1.0 (RC1)

Sean Owen-2
Er, yeah uh oh. Did the staging repo accidentally get closed/released? Maybe I'm also missing something.
If so then one way or the other we can't undo that as the 3.1.0 release in Maven, as far as I know. We can make a 3.1.1, but then it's kind of weird there was never any 3.1.0 source release.

We may well decide that, well, that 3.1.0 release was fine enough actually, that the blockers aren't actually that blocking. And the two mentioned so far don't sound so much like blockers.
- If the Kafka problem happened in 3.0, then it isn't a regression from 3.0 to 3.1, technically
- If CRAN wasn't working in 3.0.1, that's not a regression

Naturally, of course, we may wish to fix those fast. And we can create a 3.1.1 release fast if desired. But if that's the extent of it (no serious regressions), we might decide to just go ahead with this RC as the 3.1.0 release and move on to 3.1.1 quickly. At the least, again, it would be weird to have no actual 3.1.0 release on the ASF if we can't undo this.

FWIW the release looks OK to me as is, +1. I don't think the above are blockers.




Re: [VOTE] Release Spark 3.1.0 (RC1)

Hyukjin Kwon

Yes, it was my mistake. I faced the same issue as INFRA-20651, and it is worse in my case because I mistakenly assumed that RCs and releases are published separately.
Right after this, I filed an INFRA JIRA, INFRA-21266, to revert it. We can wait and see how it goes.

That said, I know that strictly speaking it's impossible to remove. It is possible to overwrite, but that will affect people who already have it in their cache.
I am thinking of two options:

  • Skip 3.1.0 and release 3.1.1 right away, since the release isn’t officially out to the main Apache repo/mirrors but only to one of the downstream channels. We can just say that there was something wrong during the 3.1.0 release so it became 3.1.1 right away.
  • Release 3.1.0, of course based on the vote results here. We could release 3.1.1 fast that exceptionally allows a bit of breaking changes, properly documented in the release notes and migration guide.
I would appreciate hearing other people’s opinions.

Thanks.




Re: [VOTE] Release Spark 3.1.0 (RC1)

Dongjoon Hyun-2
Thank you, Jacek, Sean, and Hyukjin.

The release is a human-driven process. Everyone can make mistakes.

For example, I released Apache Spark 2.2.3 with a missing pandoc, but we didn't touch it because it's a community-blessed official version.


For this incident, given the situation, I'm +1 for `Skipping 3.1.0 and starting 3.1.1 RC1` instead of making it official.

It's because

1. The vote is important in the Apache project management process.
    The existing 3.1.0 artifacts in Maven Central are not a community-blessed version.
2. Since this is the first incident, we had better build a rule to handle this kind of accident.
    If we approve `3.1.0` because it was published accidentally, it could set a bad precedent.
    In the worst case, a release manager could accidentally publish Spark 10.1.1 in the future without a vote.

BTW, thank you, Hyukjin, for all your efforts to prepare 3.1.0 as a release manager.
We know that you devote lots of your time to make it happen.

Bests,
Dongjoon.



Re: [VOTE] Release Spark 3.1.0 (RC1)

Sean Owen-2
OK, we'll have to update the release page to clarify there was never a real 3.1.0 release then.

But I'm not suggesting releasing 3.1.0 _because_ it was published accidentally. 
I'm suggesting we figure out, as we normally would, whether we would have released it, and if so, great. If not, fine, we must skip the version.


Re: [VOTE] Release Spark 3.1.0 (RC1)

Tom Graves-2
In reply to this post by Hyukjin Kwon
I think it makes sense to wait and see what they say on INFRA-21266.  

In the meantime, hopefully people can start testing it, and if no other problems are found and the vote passes, it can stay published. It seems like the 2 issues above wouldn't be blockers in my opinion and could be handled in a 3.1.1, but others can chime in too.

If we find other issues with it in testing and they can't revert it in INFRA-21266, I assume we handle it by putting some documentation out there telling people not to use it, and we go to 3.1.1.

One thing I didn't follow was the comment: "release 3.1.1 fast that exceptionally allows a bit of breaking changes" - what do you mean by that?

If there is anything we can add to our release process documentation to prevent this in the future, that would be great as well.

Tom


Re: [VOTE] Release Spark 3.1.0 (RC1)

Hyukjin Kwon
Thanks Dongjoon, Sean and Tom. I just thought that we might need some more bug fixes or changes if RC1 passes as a regular release, given the relatively small number of RCs.
I agree that if this RC passes, it's just that an RC passed normally per the regular process, and there's nothing wrong here. In principle, there shouldn't be any special treatment or difference in 3.1.1.
I meant it more as a practical point: we might end up needing some more bug fixes or breaking changes (as an exception, of course), which happens sometimes.



Re: [VOTE] Release Spark 3.1.0 (RC1)

Jungtaek Lim-2
No worries about the accident. We're human beings, and everyone can make a mistake. Let's wait and see the response of INFRA-21266.

Just my 2 cents: I'm actually leaning toward skipping 3.1.0 and starting the release process for 3.1.1, as anyone could end up "rushing" verification of 3.1.0. As we're already biased by the fact that the release is already available, the RC might not be tested intensively and extensively. I'm also OK with continuing verification on RC1 and considering it the official 3.1.0 once the vote passes, if we are sure to do the normal release candidate QA without bias. (By bias I mean skipping some verifications people would normally do, or deferring serious bugs to "later" based on the expectation that 3.1.1 comes pretty soon.)

On Thu, Jan 7, 2021 at 6:56 AM Hyukjin Kwon <[hidden email]> wrote:
Thanks Dongjoon, Sean and Tom. I just thought that we could have some more bug fixes or some changes if RC1 passes as a regular release due to the relatively fewer RCs.
I agree that if this RC passes, it's just that an RC passed normally per the regular process, and there's nothing wrong here. By right, there shouldn't be any special treatment or difference in 3.1.1.
I more meant a practical point that we might happen to face some more bug fixes or breaking changes (of course as an exception) that happens sometimes.


2021년 1월 7일 (목) 오전 6:44, Tom Graves <[hidden email]>님이 작성:
I think it makes sense to wait and see what they say on INFRA-21266.  

In the mean time hopefully people can start testing it and if no other problems found and vote passes can stay published.  It seems like the 2 issues above wouldn't be blockers in my opinion and could be handled in a 3.1.1 but others can chime too.

If we find other issues with it in testing and they can't revert in INFRA-21266 - I assume we handle by putting some documentation out there telling people not to use it and we go to 3.1.1.  

One thing I didn't follow was the comment: "release 3.1.1 fast that exceptionally allows a bit of breaking changes" - what do you mean by that?

if there is anything we can add to our release process documentation to prevent in the future that would be great as well.

Tom

On Wednesday, January 6, 2021, 03:07:26 PM CST, Hyukjin Kwon <[hidden email]> wrote:


Yes, it was my mistake. I faced the same issue as INFRA-20651, and it is worse in my case because I misunderstood that RC and releases are separately released out.
Right after this, I filed an INFRA JIRA to revert this at INFRA-21266. We can wait and see how it goes.

Though, I know it’s impossible to remove by right. It is possible to overwrite but it will affect people who already have it in their cache.
I am thinkthing two options:

  • Skip 3.1.0 and release 3.1.1 right away since the release isn’t officially out to the main Apache repo/mirrors but only one of the downstream channels. We can just say that there was something wrong during the 3.1.0 release so it became 3.1.1 right away.
  • Release 3.1.0 out, of course, based on the vote results here. We could release 3.1.1 fast that exceptionally allows a bit of breaking changes with properly documenting it in a release note and migration guide.
I would appreciate it if I could hear other people' opinions.

Thanks.





Re: [VOTE] Release Spark 3.1.0 (RC1)

Sean Owen-2
I just don't see a reason to believe there's a rush. Just test it as normal? I did, you can too, etc.
Or specifically what blocks the current RC?

On Wed, Jan 6, 2021 at 5:46 PM Jungtaek Lim <[hidden email]> wrote:
No worries about the accident. We're human beings, and everyone can make a mistake. Let's wait and see the response of INFRA-21266.

Just my 2 cents: I'm actually leaning toward skipping 3.1.0 and starting the release process for 3.1.1, as anyone could end up "rushing" verification of 3.1.0. Since we're already biased by the fact that the release is already available, the RC might not be tested intensively and extensively. I'm also OK with continuing verification on RC1 and considering it the official 3.1.0 once the vote passes, if we are sure to do the normal release candidate QA without bias. (By bias I mean skipping some verifications people would normally do, or deferring serious bugs to "later" based on the expectation that 3.1.1 comes pretty soon.)

On Thu, Jan 7, 2021 at 6:56 AM Hyukjin Kwon <[hidden email]> wrote:
Thanks Dongjoon, Sean and Tom. I just thought that we could have some more bug fixes or some changes if RC1 passes as a regular release, given the relatively small number of RCs.
I agree that if this RC passes, it's just that an RC passed normally per the regular process, and there's nothing wrong here. In principle, there shouldn't be any special treatment or difference in 3.1.1.
My point was more practical: we might end up facing some more bug fixes or breaking changes (as an exception, of course), which happens sometimes.







Re: [VOTE] Release Spark 3.1.0 (RC1)

Dongjoon Hyun-2
Before we discovered the pre-uploaded artifacts, both Jungtaek and Hyukjin had already shared two blockers here.
IIUC, that implicitly meant RC1 had failed at that time.

In addition to that, there are two correctness issues. So, I made up my mind to cast -1 for this RC1 before joining this thread.

SPARK-34011 ALTER TABLE .. RENAME TO PARTITION doesn't refresh cache (committed after tagging)
SPARK-34027 ALTER TABLE .. RECOVER PARTITIONS doesn't refresh cache (PR is under review)

Although the above issues are not regressions, they are enough for me to give -1 for 3.1.0 RC1.


Re: [VOTE] Release Spark 3.1.0 (RC1)

Sean Owen-2
I don't agree the first two are blockers, for reasons I gave earlier.
The two correctness issues do look important - are they regressions from 3.0.1?
I do agree we'd probably cut a new RC for those in any event, so I agree with the plan to drop 3.1.0 (if the Maven release can't be overwritten).


Re: [VOTE] Release Spark 3.1.0 (RC1)

Hyukjin Kwon
I think it would be great, though, to have a clear blocker that makes the release pointless if we want to drop this RC, given that in practice we will schedule 3.1.1 sooner, so non-regression bug fixes will reach end users relatively fast.
That would make it clear which option we should take. I personally don't mind dropping 3.1.0 as well; we'll have to wait for the INFRA team's response anyway.



Re: [VOTE] Release Spark 3.1.0 (RC1)

Hyukjin Kwon
Okay, let me just start preparing 3.1.1. I think that will address all concerns, except that 3.1.0 will remain in Maven as an incomplete release.
In principle, removal from the Maven repo is disallowed. Overwriting is possible as far as I know, but other mirrors that maintain a cache would be affected.
Maven is one of the downstream publishing channels, and we haven't officially announced the release or published it to the Apache repo anyway.
I will prepare a news post for spark-website explaining that 3.1.0 was incompletely published because something went wrong during the release process, and that we are going to 3.1.1 right away.
Are we all good with this?




Re: [VOTE] Release Spark 3.1.0 (RC1)

cloud0fan
I agree with Jungtaek that people are likely to be biased when testing 3.1.0. At the least, this will not be the same community-blessed release as previous ones, because the voting is already affected by the fact that 3.1.0 is already in Maven Central. Skipping 3.1.0 sounds better to me.


Re: [VOTE] Release Spark 3.1.0 (RC1)

Holden Karau
I think that posting the 3.1.0 Maven release was an accident, and going to 3.1.1 RCs is the right step forward.
I'd ask for maybe a day before cutting the 3.1.1 release. I think https://issues.apache.org/jira/browse/SPARK-34018 is also a blocker (at first I thought it was just a test issue, but Dongjoon pointed out the NPE happens in prod too).

I'd also like to echo that it's totally OK; we all make mistakes, especially in partially manual and partially automated environments. I've created a bunch of RC labels without realizing they were getting pushed automatically.



--
Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9 

Re: [VOTE] Release Spark 3.1.0 (RC1)

Hyukjin Kwon
Thank you Holden and Wenchen!

Let me:
- first prepare a PR for a spark-website news post about the 3.1.0 accident, late tonight (in KST)
- and start preparing 3.1.1 in a few more days, probably next Monday, in case other people have different thoughts


