I think this is a really important feature for Spark.
First, there is already a lot of interest in alternative shuffle storage in the community, from enabling dynamic allocation on Kubernetes to simply improving stability in standard on-premise use
of Spark. However, users are often stuck doing this in forks of Spark, in ways that are either unmaintainable (because they copy-paste many Spark internals) or incorrect (because they do not properly handle speculative execution and stage retries).
Second, I think the specific proposal strikes the right balance between flexibility and excessive complexity, allowing for incremental improvements. A lot of work has already gone into figuring out which pieces are essential to make
alternative shuffle storage implementations feasible.
Of course, that means it doesn't include everything imaginable; some things still aren't supported, and some users will still choose the older ShuffleManager API to get total control over all of shuffle. But we know there is a reasonable set of things that
can be implemented behind the API as a first step, and the API can continue to evolve.
On Fri, Jun 14, 2019 at 12:13 PM Ilan Filonenko <[hidden email]> wrote:
+1 (non-binding). This API is versatile and flexible enough to handle Bloomberg's internal use cases. The ability for us to vary implementation strategies
is quite appealing. It is also worth noting the minimal changes to Spark core required to make it work. This is a much-needed addition to the Spark shuffle story.
+1 This is great work, allowing different sort-shuffle write/read implementations to be plugged in! It is also great to see that it retains the current Spark configuration