Straw poll: dropping support for things like Scala 2.10


Straw poll: dropping support for things like Scala 2.10

Sean Owen
I'd like to gauge where people stand on the issue of dropping support for a few things that were considered for 2.0.

First: Scala 2.10. We've seen a number of build breakages this week because the PR builder only tests 2.11. No big deal at this stage, but it did cause me to wonder whether it's time to plan to drop 2.10 support, especially with 2.12 coming soon.

Next, Java 7. It's reasonably old and out of public updates at this stage. It's not that painful to keep supporting, to be honest. It would simplify some bits of code, some scripts, some testing.

Hadoop versions: I think the general argument is that almost anyone would be using at least 2.6, and it would simplify some code that has to use reflection to call not-even-that-new APIs. It would remove some moderate complexity in the build.
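
(For illustration only: the kind of reflection involved looks roughly like the sketch below. This is not Spark's actual code, and the method name is hypothetical; it just shows the pattern that goes away once a minimum Hadoop version can be assumed.)

    // Illustrative sketch, not Spark source. "getBytesRead" stands in for whatever
    // newer Hadoop API the code needs; on an older Hadoop release it may not exist.
    def bytesReadViaReflection(stats: AnyRef): Option[Long] =
      try {
        val m = stats.getClass.getMethod("getBytesRead")
        Some(m.invoke(stats).asInstanceOf[Long])
      } catch {
        case _: NoSuchMethodException => None // older Hadoop: API absent, fall back
      }

    // With Hadoop >= 2.6 assumed, the reflection disappears and this becomes a direct call:
    // val bytes = stats.getBytesRead()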


"When" is a tricky question. Although it's a little aggressive for minor releases, I think these will all happen before 3.x regardless. 2.1.0 is not out of the question, though coming soon. What about ... 2.2.0?


Although I tend to favor dropping support, I'm mostly asking for current opinions.

Re: Straw poll: dropping support for things like Scala 2.10

Holden Karau
I'd also like to add Python 2.6 to the list of things. We've considered dropping it before but never followed through to the best of my knowledge (although on mobile right now so can't double check).



--
Cell : 425-233-8271


Re: Straw poll: dropping support for things like Scala 2.10

Mark Hamstra
In reply to this post by Sean Owen
What's changed since the last time we discussed these issues, about 7 months ago?  Or, another way to formulate the question: What are the threshold criteria that we should use to decide when to end Scala 2.10 and/or Java 7 support?



Re: Straw poll: dropping support for things like Scala 2.10

Sean Owen
The general forces are that new versions of things to support emerge, and are valuable to support, but have some cost to support in addition to old versions. And the old versions become less used and therefore less valuable to support, and at some point it tips to being more cost than value. It's hard to judge these costs and benefits.

Scala is perhaps the trickiest one because of the general mutual incompatibilities across minor versions. The cost of supporting multiple versions is high, and a third version is about to arrive. That's probably the most pressing question. It's actually biting with some regularity now, with compile errors on 2.10.
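
(To make that cost concrete: in sbt terms, cross-building looks roughly like the snippet below. Spark's own build handles this through Maven profiles and helper scripts rather than plain sbt, so this is only a sketch of the general mechanism; every listed Scala version means another full compile/test/publish matrix.)

    // build.sbt sketch (illustrative; not Spark's actual build definition)
    scalaVersion := "2.11.8"
    crossScalaVersions := Seq("2.10.6", "2.11.8")
    // adding 2.12 would make it a three-way matrix:
    // crossScalaVersions := Seq("2.10.6", "2.11.8", "2.12.0")

Running "sbt +test" then compiles and tests against every listed Scala version in turn.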

(Python I confess I don't have an informed opinion about.)

Java, Hadoop are not as urgent because they're more backwards-compatible. Anecdotally, I'd be surprised if anyone today would "upgrade" to Java 7 or an old Hadoop version. And I think that's really the question. Even if one decided to drop support for all this in 2.1.0, it would not mean people can't use Spark with these things. It merely means they can't necessarily use Spark 2.1.x. This is why we have maintenance branches for 1.6.x, 2.0.x.

Tying Scala 2.11/12 support to Java 8 might make sense.

In fact, I think that's part of the reason an update in master, perhaps for 2.1.x, could be overdue: it really is just the beginning of the end of the support burden. If you want to stop dealing with these things in about six months, they need to stop being supported in minor branches right about now.






Re: Straw poll: dropping support for things like Scala 2.10

Cody Koeninger-2
I think only supporting one version of Scala at any given time is not sufficient; two probably is OK.

I.e., don't drop 2.10 before 2.12 is out and supported.




Re: Straw poll: dropping support for things like Scala 2.10

Ofir Manor
I think that 2.1 should include a visible deprecation message about Java 7, Scala 2.10 and older Hadoop versions (plus Python, if there is a consensus on that) to give users/admins early warning, followed by dropping them from trunk for 2.2 once 2.1 is released.
Personally, we use only Scala 2.11 on JDK8.
Cody - Scala 2.12 will likely be released before Spark 2.1, maybe even later this week: http://scala-lang.org/news/2.12.0-RC2

Ofir Manor

Co-Founder & CTO | Equalum

Mobile: +972-54-7801286 | Email: [hidden email]






Re: Straw poll: dropping support for things like Scala 2.10

Daniel Siegmann-2
After support is dropped for Java 7, can we have encoders for java.time classes (e.g. LocalDate)? If so, then please drop support for Java 7 ASAP. :-)
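
(For context, a rough sketch of what is being asked for. The built-in encoder in the last two commented lines is hypothetical and does not exist as of Spark 2.0; the Kryo-based encoder is the workaround available today, assuming a Java 8 runtime.)

    import java.time.LocalDate
    import org.apache.spark.sql.{Encoder, Encoders, SparkSession}

    val spark = SparkSession.builder().master("local[*]").appName("localdate-sketch").getOrCreate()

    // Workaround today: an opaque Kryo encoder (stored as binary, no columnar schema).
    implicit val localDateEnc: Encoder[LocalDate] = Encoders.kryo[LocalDate]
    val ds = spark.createDataset(Seq(LocalDate.of(2016, 10, 25)))

    // Hoped-for once Java 7 is dropped (hypothetical built-in support):
    // case class Event(id: Long, date: LocalDate)
    // val events = Seq(Event(1L, LocalDate.now())).toDS()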

Re: Straw poll: dropping support for things like Scala 2.10

Koert Kuipers
In reply to this post by Ofir Manor
It will take time before all the libraries that Spark depends on are available for Scala 2.12, so we are not talking about Spark 2.1.x, and probably not 2.2.x either, for Scala 2.12.

It technically makes sense to drop Java 7 and Scala 2.10 around the same time Scala 2.12 is introduced.

We are still heavily dependent on Java 7 (and would be on Python 2.6 if we used Python, but we don't). I am surprised to see new clusters installed in the last few months (latest CDH and HDP versions) still running on Java 7. Even getting Java 8 installed on these clusters so we can use it in YARN is often not an option. It beats me why this is still happening.

We do not use Scala 2.10 at all anymore.





Re: Straw poll: dropping support for things like Scala 2.10

Nicholas Chammas
In reply to this post by Holden Karau
FYI: Support for both Python 2.6 and Java 7 was deprecated in 2.0 (see release notes under Deprecations). The deprecation notice didn't offer a specific timeline for completely dropping support other than to say they "might be removed in future versions of Spark 2.x".

Not sure what the distinction between deprecating and dropping support is for language versions, since in both cases it seems like it's OK to do things not compatible with the deprecated versions.

Nick




Re: Straw poll: dropping support for things like Scala 2.10

Mark Hamstra
No, I think our intent is that using a deprecated language version can generate warnings, but that it should still work; whereas once we remove support for a language version, then it really is ok for Spark developers to do things not compatible with that version and for users attempting to use that version to encounter errors.

With that understanding, the first steps toward removing support for Scala 2.10 and/or Java 7 would be to deprecate them in 2.1.0.  Actual removal of support could then occur at the earliest in 2.2.0. 




Re: Straw poll: dropping support for things like Scala 2.10

Nicholas Chammas

> No, I think our intent is that using a deprecated language version can generate warnings, but that it should still work; whereas once we remove support for a language version, then it really is ok for Spark developers to do things not compatible with that version and for users attempting to use that version to encounter errors.

OK, understood.

> With that understanding, the first steps toward removing support for Scala 2.10 and/or Java 7 would be to deprecate them in 2.1.0. Actual removal of support could then occur at the earliest in 2.2.0.

Java 7 is already deprecated per the 2.0 release notes which I linked to. Here they are again.





Re: Straw poll: dropping support for things like Scala 2.10

Mark Hamstra
You're right; so we could remove Java 7 support in 2.1.0.

The fact that neither Holden nor I had this immediately in mind does suggest, however, that we should be doing a better job of making sure that information about deprecated language versions is inescapably public. That's harder to do with a language version deprecation, since using such a version doesn't give you the same kind of repeated warnings that using a deprecated API does.
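
(To illustrate the contrast: a deprecated API nags at every call site when compiled with -deprecation, which a deprecated language or JVM version cannot do on its own; a rough sketch:)

    object DeprecationDemo {
      @deprecated("use newMethod instead", "2.0.0")
      def oldMethod(x: Int): Int = x + 1

      def main(args: Array[String]): Unit = {
        println(oldMethod(1)) // scalac warns here...
        println(oldMethod(2)) // ...and again at every call site
      }
    }

    // A deprecated Java or Scala *version* has no equivalent compile-time hook; the
    // closest substitute is a one-time runtime warning, e.g. at SparkContext startup.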





Re: Straw poll: dropping support for things like Scala 2.10

Nicholas Chammas
Agreed. Would an announcement/reminder on the dev and user lists suffice in this case? Basically, just point out what's already been mentioned in the 2.0 release notes, and include a link there so people know what we're referencing.




Re: Straw poll: dropping support for things like Scala 2.10

Dongjoon Hyun
Hi, All.

That's great; it's progress.

Then, at least, in 2017, Spark 2.2.0 will be out with JDK8 and Scala 2.11/2.12, right?

Bests,
Dongjoon.



Re: Straw poll: dropping support for things like Scala 2.10

Daniel Siegmann-2
Is the deprecation of JDK 7 and Scala 2.10 documented anywhere outside the release notes for Spark 2.0.0? I do not consider release notes to be sufficient public notice for deprecation of supported platforms - this should be noted in the documentation somewhere. Here are the only mentions I could find:

At http://spark.apache.org/downloads.html it says:

"Note: Starting version 2.0, Spark is built with Scala 2.11 by default. Scala 2.10 users should download the Spark source package and build with Scala 2.10 support."

At http://spark.apache.org/docs/latest/#downloading it says:

"Spark runs on Java 7+, Python 2.6+/3.4+ and R 3.1+. For the Scala API, Spark 2.0.1 uses Scala 2.11. You will need to use a compatible Scala version (2.11.x)."

At http://spark.apache.org/docs/latest/programming-guide.html#linking-with-spark it says:
  • "Spark 2.0.1 is built and distributed to work with Scala 2.11 by default. (Spark can be built to work with other versions of Scala, too.) To write applications in Scala, you will need to use a compatible Scala version (e.g. 2.11.X)."
  • "Spark 2.0.1 works with Java 7 and higher. If you are using Java 8, Spark supports lambda expressions for concisely writing functions, otherwise you can use the classes in the org.apache.spark.api.java.function package."
  • "Spark 2.0.1 works with Python 2.6+ or Python 3.4+. It can use the standard CPython interpreter, so C libraries like NumPy can be used. It also works with PyPy 2.3+."


Re: Straw poll: dropping support for things like Scala 2.10

Dongjoon Hyun
Hi, Daniel.

I guess that kind of work will start in earnest in 2.1.0 after the PMC's announcement/reminder on the mailing list.

Bests,
Dongjoon.




Re: Straw poll: dropping support for things like Scala 2.10

rxin
We could adopt the following concrete proposal:

1. Plan to remove support for Java 7 / Scala 2.10 in Spark 2.2.0 (Mar/Apr 2017).

2. In the Spark 2.1.0 release, aggressively and explicitly announce the deprecation of Java 7 / Scala 2.10 support:

(a) it should appear in the release notes and in any documentation that mentions how to build Spark,

(b) and a warning should be shown every time a SparkContext is started on Scala 2.10 or Java 7.
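
(A minimal sketch of what the check in (b) could look like; the helper name and where it is called from are assumptions, not an actual patch:)

    // Hypothetical helper, called once during SparkContext initialization.
    def warnOnDeprecatedVersions(logWarning: String => Unit): Unit = {
      val javaVersion  = System.getProperty("java.specification.version") // e.g. "1.7", "1.8"
      val scalaVersion = scala.util.Properties.versionNumberString        // e.g. "2.10.6"

      if (javaVersion == "1.7") {
        logWarning("Support for Java 7 is deprecated as of Spark 2.0.0 and may be removed in Spark 2.2.0.")
      }
      if (scalaVersion.startsWith("2.10")) {
        logWarning("Support for Scala 2.10 is deprecated and may be removed in Spark 2.2.0.")
      }
    }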






Re: Straw poll: dropping support for things like Scala 2.10

Koert Kuipers
That sounds good to me.





Re: Straw poll: dropping support for things like Scala 2.10

Michael Armbrust
In reply to this post by rxin
+1





Re: Straw poll: dropping support for things like Scala 2.10

Sean Owen
In reply to this post by Koert Kuipers
Seems OK by me.
How about Hadoop < 2.6 and Python 2.6? Those seem more removable. I'd like to add those to the list of things that will begin to be unsupported six months from now.

