[SPARK-25299] A Discussion About Shuffle Metadata Tracking

classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view

[SPARK-25299] A Discussion About Shuffle Metadata Tracking

Matt Cheah-2

Hi everyone,


A working group in the community have been having ongoing discussions regarding how we can allow for flexible storage solutions for shuffle data that is compatible with containerized systems, more resilient to node failures, and can support disaggregated storage architectures.


One of the core challenges we have been trying to overcome is navigating the space of shuffle metadata tracking, and reasoning about how we approach recomputing lost shuffle blocks in the case when the shuffle file storage system is not resilient.


I have written a design document on the subject, and a proposed set of APIs to fix it. These should be considered as part of the APIs for SPARK-25299. Once we have reached some common conclusion on the proper APIs to build, I can modify the original SPARK-25299 SPIP to reflect the choices we’ve made. But I wanted to write more extensively on this topic separately to encourage focused discussion on this subset of the problem space.


You can find the design document here: https://docs.google.com/document/d/1Aj6IyMsbS2sdIfHxLvIbHUNjHIWHTabfknIPoxOrTjk/edit?usp=sharing


If you would like to catch up on the discussions we have had so far, I give some background to the subject matter and have linked to other relevant design documents and discussion threads in this document.


Feedback is definitely appreciated – I acknowledge that this is a fairly complex space with lots of potential viable options, so I’m looking forward to engaging with dialogue moving forward.




-Matt Cheah