ASF board report for May

Matei Zaharia
Spark's quarterly ASF board report is due on May 15th, so I wanted to run the draft by everyone first. Let me know whether I missed anything:


Apache Spark is a fast and general engine for large-scale data processing. It offers high-level APIs in Java, Scala, Python and R as well as a rich set of libraries including stream processing, machine learning, and graph analytics.

Project status:

- We released Apache Spark 2.4.1, 2.4.2 and 2.3.3 in the past three months to fix issues in the 2.3 and 2.4 branches.

- Discussions are under way on our dev and user mailing lists about the next feature release, which will likely be Spark 3.0. Some key questions include whether to remove various deprecated APIs, and which minimum versions of Java, Python, Scala, etc. to support. There are also a number of new features targeting this release. We encourage everyone in the community to give feedback on these discussions through our mailing lists or issue tracker.

- Several Spark Project Improvement Proposals (SPIPs) for major additions to Spark were discussed on the dev list in the past three months. These include support for passing columnar data efficiently into external engines (e.g. GPU-based libraries), accelerator-aware scheduling, new data source APIs, and .NET support. Some of these have been accepted (e.g. the table metadata and accelerator-aware scheduling proposals) while others are still being discussed.


- We are continuing engagement with various organizations.

Latest releases:

- April 23rd, 2019: Spark 2.4.2
- March 31st, 2019: Spark 2.4.1
- February 15th, 2019: Spark 2.3.3

Committers and PMC:

- The latest committer was added on Jan 29th, 2019 (Jose Torres).
- The latest PMC member was added on Jan 12th, 2018 (Xiao Li).
