[VOTE] Apache Spark 2.2.0 (RC4)


[VOTE] Apache Spark 2.2.0 (RC4)

Michael Armbrust
Please vote on releasing the following candidate as Apache Spark version 2.2.0. The vote is open until Thurs, June 8th, 2017 at 12:00 PST and passes if a majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Spark 2.2.0
[ ] -1 Do not release this package because ...


To learn more about Apache Spark, please see http://spark.apache.org/

The tag to be voted on is v2.2.0-rc4 (377cfa8ac7ff7a8a6a6d273182e18ea7dc25ce7e)

List of JIRA tickets resolved can be found with this filter.

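The release files, including signatures, digests, etc. can be found at:
http://home.apache.org/~pwendell/spark-releases/spark-2.2.0-rc4-bin/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1241/

The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-releases/spark-2.2.0-rc4-docs/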


FAQ

How can I help test this release?

If you are a Spark user, you can help us test this release by taking an existing Spark workload and running on this release candidate, then reporting any regressions.

What should happen to JIRA tickets still targeting 2.2.0?

Committers should look at those and triage. Extremely important bug fixes, documentation, and API tweaks that impact compatibility should be worked on immediately. Everything else please retarget to 2.3.0 or 2.2.1.

But my bug isn't fixed!??!

In order to make timely releases, we will typically not hold the release unless the bug in question is a regression from 2.1.1.
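
For readers unfamiliar with the voting rule stated at the top of this message, here is a minimal sketch of one reading of it: the vote passes when at least three binding (PMC) +1 votes are cast and binding +1 votes outnumber binding -1 votes. The `Vote` type and `release_vote_passes` function are hypothetical names used only for illustration; they are not part of any Apache tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    voter: str
    value: int     # +1, 0, or -1
    binding: bool  # True when cast by a PMC member


def release_vote_passes(votes, min_binding_plus_ones=3):
    """Return True when binding +1 votes reach the minimum and outnumber binding -1 votes."""
    binding = [v for v in votes if v.binding]
    plus = sum(1 for v in binding if v.value > 0)
    minus = sum(1 for v in binding if v.value < 0)
    return plus >= min_binding_plus_ones and plus > minus
```

Non-binding votes are ignored by the tally in this sketch, although in practice they still carry useful testing feedback.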

Re: [VOTE] Apache Spark 2.2.0 (RC4)

Sean Owen
Xiao opened a blocker on 2.2.0 this morning:

SPARK-20980 Rename the option `wholeFile` to `multiLine` for JSON and CSV

I don't see that this should block?

We still have 7 Critical issues:

SPARK-20520 R streaming tests failed on Windows
SPARK-20512 SparkR 2.2 QA: Programming guide, migration guide, vignettes updates
SPARK-20499 Spark MLlib, GraphX 2.2 QA umbrella
SPARK-20508 Spark R 2.2 QA umbrella
SPARK-20513 Update SparkR website for 2.2
SPARK-20510 SparkR 2.2 QA: Update user guide for new features & APIs
SPARK-20507 Update MLlib, GraphX websites for 2.2

I'm going to assume that the R test issue isn't actually that big a deal, and that the 2.2 items are done. Anything that really is for 2.2 needs to block the release; Joseph what's the status on those?


Re: [VOTE] Apache Spark 2.2.0 (RC4)

Michael Armbrust
I commented on that JIRA; I don't think it should block the release. We can support both options long term if this vote passes. It looks like the remaining JIRAs are either doc/website updates that can happen after the vote or QA that should be done on this RC. I think we are ready to start testing this release seriously!
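
As an illustration of what "supporting both options" could look like for the `wholeFile` / `multiLine` rename in SPARK-20980, here is a minimal sketch of alias resolution. It is not Spark's implementation: `resolve_bool_option` is a hypothetical helper, and the case-insensitive key matching and the "preferred name wins" rule are assumptions made for the example.

```python
def resolve_bool_option(options, preferred, legacy, default=False):
    """Look up a boolean option under its preferred name, falling back to a legacy alias."""
    # Assumption for this sketch: option keys are matched case-insensitively.
    lowered = {key.lower(): value for key, value in options.items()}
    # The preferred name is checked first, so it wins when both names are set.
    for name in (preferred, legacy):
        if name.lower() in lowered:
            return str(lowered[name.lower()]).strip().lower() == "true"
    return default
```

With this shape, a reader option map that sets only the old name keeps working, while new code can use the new name.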


Re: [VOTE] Apache Spark 2.2.0 (RC4)

Dong Joon Hyun

Hi, Michael.

Can we be more clear on deprecation messages in 2.2.0-RC4 documentation?

> Spark runs on Java 8+, Python 2.6+/3.4+ and R 3.1+.

    -> Python 2.7+ ?
    https://issues.apache.org/jira/browse/SPARK-12661  (Status: `Open`, Target Version: `2.2.0`, Label: `ReleaseNotes`)

> Note that support for Python 2.6 is deprecated as of Spark 2.0.0, and support for Scala 2.10 and versions of Hadoop before 2.6 are deprecated as of Spark 2.1.0, and may be removed in Spark 2.2.0.

    -> Support for versions of Hadoop before 2.6.5 are removed as of 2.2.0.
    -> Support for Scala 2.10 may be removed in Spark 2.3.0.

Since this is a doc only issue, can we revise this without affecting the RC4 vote?

I created a PR for this, https://github.com/apache/spark/pull/18207.

Bests,
Dongjoon.

 

 


Re: [VOTE] Apache Spark 2.2.0 (RC4)

Sean Owen
In reply to this post by Michael Armbrust
(I apologize for going on about this, but I've asked ~4 times: could you make the URLs here in the form email HTTPS URLs? It sounds minor, but we're asking people to verify the integrity of software and hashes, and this is the one case where it is actually important.)

The "2.2" JIRAs don't look like updates to the non-version-specific web pages. If they affect release docs (i.e. under spark.apache.org/docs/), or the code, those QA/doc updates have to happen before a release. Right? I feel like this is self-evident but this comes up every minor release, that some testing or doc changes for a release can happen after the code and docs for the release are finalized. They obviously can't.

I know, I get it. I think the reality is that the reporters don't believe there is something must-do for the 2.2.0 release, or else they'd have spoken up. In that case, these should be closed already as they're semantically "Blockers" and we shouldn't make an RC that can't pass.

... or should we? Actually, to me the idea of an "RC0" release as a preview, and RCs that are known to fail for testing purposes seem OK. But if that's the purpose here, let's say it.

If the "QA" JIRAs just represent that 'we will test things, in general', then I think they're superfluous at best. These aren't used consistently, and their intent isn't actionable (i.e. it sounds like no particular testing resolves the JIRA). They signal something that doesn't seem to match the intent.

Can we close the QA JIRAs -- and are there any actual must-have docs not already in the 2.2 branch?


Re: [VOTE] Apache Spark 2.2.0 (RC4)

Sean Owen
In reply to this post by Michael Armbrust
On the latest Ubuntu, Java 8, with -Phive -Phadoop-2.7 -Pyarn, this passes all tests. It's looking good, pending a double-check on the outstanding JIRA questions.

All the hashes and sigs are correct.
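
For anyone who wants to repeat the digest part of this check, here is a minimal sketch. It assumes the published digest is a SHA-512 value given as a hex string (possibly upper-case and split by whitespace); the function names are hypothetical. It does not cover signature verification, which needs an OpenPGP tool and the release manager's public key.

```python
import hashlib
import hmac


def sha512_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-512 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def digest_matches(path, published_hex):
    """Compare a file's SHA-512 digest with a published hex string."""
    # Drop whitespace and case differences before comparing.
    normalized = "".join(published_hex.split()).lower()
    return hmac.compare_digest(sha512_of_file(path), normalized)
```

A digest only shows that the download matches what the server published, so the digest itself should be fetched over HTTPS and the signature checked as well.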


Re: [VOTE] Apache Spark 2.2.0 (RC4)

Michael Armbrust
In reply to this post by Sean Owen
Apologies for messing up the https urls.  My mistake.  I'll try to get it right next time.

Regarding the readiness of this and previous RCs.  I did cut RC1 & RC2 knowing that they were unlikely to pass.  That said, I still think these early RCs are valuable. I know several users that wanted to test new features in 2.2 that have used them.  Now, if we would prefer to call them preview or RC0 or something I'd be okay with that as well.

Regarding doc updates, I don't think it is a requirement that they be voted on as part of the release.  Even if they are something version specific.  I think we have regularly updated the website with documentation that was merged after the release.

I personally don't think the QA umbrella JIRAs are particularly effective, but I also wouldn't ban their use if others think they are.  However, I do think that real QA needs an RC to test, so I think it is fine that there is still outstanding QA to be done when an RC is cut.  For example, I plan to run a bunch of streaming workloads on RC4 and will vote accordingly.

TL;DR: Based on what I have heard from everyone so far, there are currently no known issues that should fail the vote here.  We should begin testing RC4.  Thanks to everyone for your help!


Re: [VOTE] Apache Spark 2.2.0 (RC4)

Kazuaki Ishizaki
In reply to this post by Michael Armbrust
+1 (non-binding)

I tested it on Ubuntu 16.04 and OpenJDK8 on ppc64le. All of the tests for core have passed.

$ java -version
openjdk version "1.8.0_111"
OpenJDK Runtime Environment (build 1.8.0_111-8u111-b14-2ubuntu0.16.04.2-b14)
OpenJDK 64-Bit Server VM (build 25.111-b14, mixed mode)
$ build/mvn -DskipTests -Phive -Phive-thriftserver -Pyarn -Phadoop-2.7 package install
$ build/mvn -Phive -Phive-thriftserver -Pyarn -Phadoop-2.7 test -pl core
...
Run completed in 15 minutes, 30 seconds.
Total number of tests run: 1959
Suites: completed 206, aborted 0
Tests: succeeded 1959, failed 0, canceled 4, ignored 8, pending 0
All tests passed.
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 17:16 min
[INFO] Finished at: 2017-06-06T13:44:48+09:00
[INFO] Final Memory: 53M/510M
[INFO] ------------------------------------------------------------------------
[WARNING] The requested profile "hive" could not be activated because it does not exist.

Kazuaki Ishizaki
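
When collecting results like the ones above from several machines, the summary line can be checked mechanically. The sketch below assumes the log contains a line in exactly the format shown in the output above ("Tests: succeeded N, failed N, canceled N, ignored N, pending N"); the function names are hypothetical.

```python
import re

# Matches the test summary line as it appears in the output above.
SUMMARY = re.compile(
    r"Tests: succeeded (\d+), failed (\d+), canceled (\d+), ignored (\d+), pending (\d+)"
)


def parse_test_summary(log_text):
    """Return the counts from the first summary line, or None if no line matches."""
    match = SUMMARY.search(log_text)
    if match is None:
        return None
    keys = ("succeeded", "failed", "canceled", "ignored", "pending")
    return dict(zip(keys, map(int, match.groups())))


def run_is_clean(log_text):
    """True only when a summary line was found and it reports zero failures."""
    counts = parse_test_summary(log_text)
    return counts is not None and counts["failed"] == 0
```

A log with no summary line is treated as not clean, so a build that aborted before running tests is not mistaken for a pass.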



From: Michael Armbrust <[hidden email]>
To: "[hidden email]" <[hidden email]>
Date: 2017/06/06 04:15
Subject: [VOTE] Apache Spark 2.2.0 (RC4)




Please vote on releasing the following candidate as Apache Spark version 2.2.0. The vote is open until Thurs, June 8th, 2017 at 12:00 PST and passes if a majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Spark 2.2.0
[ ] -1 Do not release this package because ...


To learn more about Apache Spark, please see http://spark.apache.org/

The tag to be voted on is v2.2.0-rc4 (377cfa8ac7ff7a8a6a6d273182e18ea7dc25ce7e)

List of JIRA tickets resolved can be found with this filter.

The release files, including signatures, digests, etc. can be found at:
http://home.apache.org/~pwendell/spark-releases/spark-2.2.0-rc4-bin/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1241/

The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-releases/spark-2.2.0-rc4-docs/


FAQ

How can I help test this release?

If you are a Spark user, you can help us test this release by taking an existing Spark workload and running on this release candidate, then reporting any regressions.

What should happen to JIRA tickets still targeting 2.2.0?

Committers should look at those and triage. Extremely important bug fixes, documentation, and API tweaks that impact compatibility should be worked on immediately. Everything else please retarget to 2.3.0 or 2.2.1.

But my bug isn't fixed!??!

In order to make timely releases, we will typically not hold the release unless the bug in question is a regression from 2.1.1.


Re: [VOTE] Apache Spark 2.2.0 (RC4)

Sean Owen
In reply to this post by Michael Armbrust
On Tue, Jun 6, 2017 at 1:06 AM Michael Armbrust <[hidden email]> wrote:
> Regarding the readiness of this and previous RCs.  I did cut RC1 & RC2 knowing that they were unlikely to pass.  That said, I still think these early RCs are valuable. I know several users that wanted to test new features in 2.2 that have used them.  Now, if we would prefer to call them preview or RC0 or something I'd be okay with that as well.

They are valuable, I only suggest it's better to note explicitly when there are blockers or must-do tasks that will fail the RC. It makes a big difference to whether one would like to +1.

I meant more than just calling them something different. An early RC could be voted as a released 'preview' artifact, at the start of the notional QA period, with a lower bar to passing, and releasable with known issues. This encourages more testing. It also resolves the controversy about whether it's OK to include an RC in a product (separate thread). 
 

> Regarding doc updates, I don't think it is a requirement that they be voted on as part of the release.  Even if they are something version specific.  I think we have regularly updated the website with documentation that was merged after the release.

They're part of the source release too, as markdown, and should be voted on. I've never understood otherwise. Have we actually released docs and then later changed them, so that they don't match the release? I don't recall that, but I do recall updating the non-version-specific website.

Aside from the oddity of having docs generated from x.y source not match docs published for x.y, you want the same protections for doc source that the project distributes as anything else. It's not just correctness, but liability. The hypothetical is always that someone included copyrighted text or something without permission and now the project can't rely on the argument that it made a good-faith effort to review what it released on the site. Someone becomes personally liable.

These are pretty technical reasons though. More practically, what's the hurry to release if docs aren't done (_if_ they're not done)? It's being presented as normal practice, but seems quite exceptional.

 
> I personally don't think the QA umbrella JIRAs are particularly effective, but I also wouldn't ban their use if others think they are.  However, I do think that real QA needs an RC to test, so I think it is fine that there is still outstanding QA to be done when an RC is cut.  For example, I plan to run a bunch of streaming workloads on RC4 and will vote accordingly.

QA on RCs is great (see above). The problem is, I can't distinguish between a JIRA that means "we must test in general", which sounds like something you too would ignore, and one that means "there is specific functionality we have to check before a release that we haven't looked at yet", which is a committer waving a flag that they implicitly do not want a release until resolved. I wouldn't +1 a release that had a Blocker software defect one of us reported. 

I know I'm harping on this, but this is the one mechanism we do use consistently (Blocker JIRAs) to clearly communicate about issues vital to a go / no-go release decision, and I think this interferes. The rest of the JIRA noise doesn't matter much. You can see we're already resorting to secondary communications as a result ("anyone have any issues that need to be fixed before I cut another RC?" emails) because this is kind of ignored, and I think we're swapping out a decent mechanism for a worse one.

I suspect, as you do, that there's no to-do here in which case they should be resolved and we're still on track for release. I'd wait on +1 until then.
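
To make the go / no-go point concrete, here is a minimal sketch of the check being described: a release is "go" only when no unresolved Blocker targets the version under vote. The `Issue` type and both functions are hypothetical and stand in for whatever a JIRA query would return; this is not a JIRA API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Issue:
    key: str
    priority: str        # e.g. "Blocker", "Critical"
    target_version: str  # e.g. "2.2.0"
    resolved: bool


def open_blockers(issues, version):
    """Return unresolved Blocker issues that target the given version."""
    return [
        issue
        for issue in issues
        if issue.priority == "Blocker"
        and issue.target_version == version
        and not issue.resolved
    ]


def is_go(issues, version):
    """A release is 'go' only when no open Blocker targets it."""
    return not open_blockers(issues, version)
```

The check is only as good as the priorities behind it, which is the concern above: an issue filed as Critical or as a "QA" umbrella never shows up here, whatever its author intended.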


Re: [VOTE] Apache Spark 2.2.0 (RC4)

Nick Pentreath
The website updates for ML QA (SPARK-20507) are not actually critical as the project website certainly can be updated separately from the source code guide and is not part of the release to be voted on. In future that particular work item for the QA process could be marked down in priority, and is definitely not a release blocker.

In any event I just resolved SPARK-20507, as I don't believe any website updates are required for this release anyway. That fully resolves the ML QA umbrella (SPARK-20499).


On Tue, 6 Jun 2017 at 10:16 Sean Owen <[hidden email]> wrote:
On Tue, Jun 6, 2017 at 1:06 AM Michael Armbrust <[hidden email]> wrote:
Regarding the readiness of this and previous RCs.  I did cut RC1 & RC2 knowing that they were unlikely to pass.  That said, I still think these early RCs are valuable. I know several users that wanted to test new features in 2.2 that have used them.  Now, if we would prefer to call them preview or RC0 or something I'd be okay with that as well.

They are valuable, I only suggest it's better to note explicitly when there are blockers or must-do tasks that will fail the RC. It makes a big difference to whether one would like to +1.

I meant more than just calling them something different. An early RC could be voted as a released 'preview' artifact, at the start of the notional QA period, with a lower bar to passing, and releasable with known issues. This encourages more testing. It also resolves the controversy about whether it's OK to include an RC in a product (separate thread). 
 

Regarding doc updates, I don't think it is a requirement that they be voted on as part of the release.  Even if they are something version specific.  I think we have regularly updated the website with documentation that was merged after the release.

They're part of the source release too, as markdown, and should be voted on. I've never understood otherwise. Have we actually released docs and then later changed them, so that they don't match the release? I don't recall that, but I do recall updating the non-version-specific website.

Aside from the oddity of having docs generated from x.y source not match docs published for x.y, you want the same protections for doc source that the project distributes as anything else. It's not just correctness, but liability. The hypothetical is always that someone included copyrighted text or something without permission and now the project can't rely on the argument that it made a good-faith effort to review what it released on the site. Someone becomes personally liable.

These are pretty technical reasons though. More practically, what's the hurry to release if docs aren't done (_if_ they're not done)? It's being presented as normal practice, but seems quite exceptional.

 
I personally don't think the QA umbrella JIRAs are particularly effective, but I also wouldn't ban their use if others think they are.  However, I do think that real QA needs an RC to test, so I think it is fine that there is still outstanding QA to be done when an RC is cut.  For example, I plan to run a bunch of streaming workloads on RC4 and will vote accordingly.

QA on RCs is great (see above). The problem is, I can't distinguish between a JIRA that means "we must test in general", which sounds like something you too would ignore, and one that means "there is specific functionality we have to check before a release that we haven't looked at yet", which is a committer waving a flag that they implicitly do not want a release until resolved. I wouldn't +1 a release that had a Blocker software defect one of us reported. 

I know I'm harping on this, but this is the one mechanism we do use consistently (Blocker JIRAs) to clearly communicate about issues vital to a go / no-go release decision, and I think this interferes. The rest of JIRA noise doesn't matter much. You can see we're already resorting to secondary communications as a result ("anyone have any issues that need to be fixed before I cut another RC?" emails) because this is kind of ignored, and I think we're swapping out a decent mechanism for a worse one.

I suspect, as you do, that there's no to-do here in which case they should be resolved and we're still on track for release. I'd wait on +1 until then.


Re: [VOTE] Apache Spark 2.2.0 (RC4)

Nick Pentreath
In reply to this post by Sean Owen
Now, on the subject of (ML) QA JIRAs.

From the ML side, I believe they are required (I think others such as Joseph will agree and in fact have already said as much).

Most are marked as Blockers, though of those the Python API coverage is strictly not a Blocker as we will never hold the release for API parity issues (unless of course there is some critical bug or missing thing, but that really falls under the standard RC bug triage process).

I believe they are Blockers, since they involve auditing binary compat and new public APIs, visibility issues, Java compat etc. I think it's obvious that an RC should not pass if these have not been checked.

I actually agree that docs and user guide are absolutely part of the release, and in fact are one of the more important pieces of the release. Apart from the issues Sean mentions, not treating these things as critical issues or even blockers is what inevitably leads, over time, to the user guide being out of date, missing important features, etc.

In practice for ML at least we definitely aim to have all the doc / guide issues done before the final release.

Now in terms of process, none of these QA issues really requires an RC; they can all be carried out once the release branch is cut. Some of the issues, like binary compat, are perhaps a bit more tricky, but they inevitably involve manually checking through the MiMa exclusions that were added to verify they are OK, so again an actual RC is not required here.

So really the answer is to more aggressively burn down these QA issues the moment the release branch has been cut. Again, I think this echoes what Joseph has said in previous threads.




Are release docs part of a release?

Sean Owen
In reply to this post by Nick Pentreath
That's good, but, I think we should agree on whether release docs are part of a release. It's important to reasoning about releases.

To be clear, you're suggesting that, say, right now you are OK with updating this page with a few more paragraphs? http://spark.apache.org/docs/2.1.0/streaming-programming-guide.html  Even though those paragraphs can't be in the released 2.1.0 doc source?

First, what is everyone's understanding of the answer?

The only official guidance I can find is http://www.apache.org/legal/release-policy.html#distribute-other-artifacts , which suggests that docs need to be released similarly, with signatures. Not quite the same question, but strongly implies they're treated like any other source that is released with a vote.

------

WHAT ARE THE REQUIREMENTS TO DISTRIBUTE OTHER ARTIFACTS IN ADDITION TO THE SOURCE PACKAGE?

ASF releases typically contain additional material together with the source package. This material may include documentation concerning the release but must contain LICENSE and NOTICE files. As mentioned above, these artifacts must be signed by a committer with a detached signature if they are to be placed in the project's distribution directory.

Again, these artifacts may be distributed only if they contain LICENSE and NOTICE files. For example, the Java artifact format is based on a compressed directory structure and those projects wishing to distribute jars must place LICENSE and NOTICE files in the META-INF directory within the jar.

Nothing in this section is meant to supersede the requirements defined here and here that all releases be primarily based on a signed source package.


On Tue, Jun 6, 2017 at 9:50 AM Nick Pentreath <[hidden email]> wrote:
The website updates for ML QA (SPARK-20507) are not actually critical as the project website certainly can be updated separately from the source code guide and is not part of the release to be voted on. In future that particular work item for the QA process could be marked down in priority, and is definitely not a release blocker.

In any event I just resolved SPARK-20507, as I don't believe any website updates are required for this release anyway. That fully resolves the ML QA umbrella (SPARK-20499).


Re: [VOTE] Apache Spark 2.2.0 (RC4)

Felix Cheung
In reply to this post by Sean Owen
All tasks on the R QA umbrella are completed
SPARK-20512

We can close this.




Re: [VOTE] Apache Spark 2.2.0 (RC4)

Holden Karau
+1. pip install to a local virtualenv works, and there is no local version string (which was blocking the PyPI upload).



Re: [VOTE] Apache Spark 2.2.0 (RC4)

Dong Joon Hyun
In reply to this post by Michael Armbrust

+1 (non-binding)

I built and tested on CentOS 7.3.1611 / OpenJDK 1.8.131 / R 3.3.3
with "-Pyarn -Phadoop-2.7 -Pkinesis-asl -Phive -Phive-thriftserver -Psparkr".
Java/Scala/R tests passed as expected.

There are two minor things.

  1. For the deprecation documentation issue (https://github.com/apache/spark/pull/18207),
    I hope it goes into the release notes instead of blocking the current vote,
    something like http://spark.apache.org/releases/spark-release-2-1-0.html.

  2. Third-party test suites may fail because of the following difference
    (https://issues.apache.org/jira/browse/SPARK-20954).
    Through Spark 2.1.1, the count was 1.

 

scala> sql("create table t(a int)")
res0: org.apache.spark.sql.DataFrame = []

scala> sql("desc table t").show
+----------+---------+-------+
|  col_name|data_type|comment|
+----------+---------+-------+
|# col_name|data_type|comment|
|         a|      int|   null|
+----------+---------+-------+

scala> sql("desc table t").count
res2: Long = 2
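
For third-party suites that assert on this count, one possible workaround is to skip rows whose col_name starts with '#' before counting. The sketch below is only an illustration on plain Scala tuples copied from the output above, so it runs without Spark. The names `DescTableCount` and `dataRows` are made up for this example, and treating every '#'-prefixed row as a header is an assumption based on the output shown, not documented Spark behavior.

```scala
// Minimal sketch, assuming header rows can be recognised by a leading '#'.
object DescTableCount {
  // (col_name, data_type, comment), mirroring the columns printed by `desc table t`.
  type DescRow = (String, String, Option[String])

  // Hypothetical helper: keep only rows that describe real columns.
  def dataRows(rows: Seq[DescRow]): Seq[DescRow] =
    rows.filterNot { case (colName, _, _) => colName.trim.startsWith("#") }

  def main(args: Array[String]): Unit = {
    // The two rows shown in the RC4 output above.
    val rc4Output: Seq[DescRow] = Seq(
      ("# col_name", "data_type", Some("comment")),
      ("a", "int", None)
    )
    println(s"raw count  = ${rc4Output.size}")           // raw count  = 2
    println(s"data count = ${dataRows(rc4Output).size}") // data count = 1
  }
}
```

Against a real DataFrame, the same idea could probably be written as a filter on the col_name column (for example with `Column.startsWith`), though I have not run that against RC4.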

 

Bests,
Dongjoon.

Re: [VOTE] Apache Spark 2.2.0 (RC4)

Ricardo Almeida-2
In reply to this post by Michael Armbrust
+1 (non-binding)

Built and tested with -Phadoop-2.7 -Dhadoop.version=2.7.3 -Pyarn -Phive -Phive-thriftserver -Pscala-2.11 on 
  • Ubuntu 17.04, Java 8 (OpenJDK 1.8.0_111)
  • macOS 10.12.5 Java 8 (build 1.8.0_131)


Re: [VOTE] Apache Spark 2.2.0 (RC4)

vaquarkhan
+1 non-binding

Regards,
vaquar khan 


Re: [VOTE] Apache Spark 2.2.0 (RC4)

Denny Lee
+1 non-binding

Tested on macOS Sierra and Ubuntu 16.04.
The test suite covers Spark SQL, ML, GraphFrames, and Structured Streaming.



Re: [VOTE] Apache Spark 2.2.0 (RC4)

Sean Owen
In reply to this post by Sean Owen
+1 from me. Felix et al indicated that the various "2.2" JIRAs had no further actions. I retargeted most of the other 2.2.0-targeted JIRAs that didn't seem like they're must-do. We have no Blockers and I'm not aware of any changes that must be in the 2.2 release that aren't.

These are the only remaining 2.2 issues, FYI:

SPARK-20520 R streaming tests failed on Windows
SPARK-15799 Release SparkR on CRAN
SPARK-18267 Distribute PySpark via Python Package Index (pypi)

On Tue, Jun 6, 2017 at 12:20 AM Sean Owen <[hidden email]> wrote:
On the latest Ubuntu, Java 8, with -Phive -Phadoop-2.7 -Pyarn, this passes all tests. It's looking good, pending a double-check on the outstanding JIRA questions.

All the hashes and sigs are correct.


Re: Are release docs part of a release?

Ryan Blue
In reply to this post by Sean Owen
I've never thought that docs are strictly part of a release and can't be updated outside of one. Javadoc jars are included in releases with jars, but that's more because they are produced by the build and are tied to the source code. There is plenty of other documentation that isn't normally included in a release, like the project's web pages and wiki content. I think the expectation is for that to be continuously updated. So my interpretation is that the release artifacts in the document you're quoting from are the source code and convenience binaries. There's definitely room for interpretation here, but I don't think it would be a problem as long as we do something reasonable.





--
Ryan Blue
Software Engineer
Netflix