Commit algorithms: what's required, what's delivered, and how they are managed


Steve Loughran


There's been discussion in various PRs about what committers do, what they are expected to do, and how they get coordinated; the general conclusion has been "this should be covered on the developer list".

Here, then, are the three PRs where this has surfaced, followed by a sketch of the commit contract under discussion.


[SPARK-22026][SQL] data source v2 write path: https://github.com/apache/spark/pull/19269

[SPARK-22078][SQL] clarify exception behaviors for all data source v2 interfaces: https://github.com/apache/spark/pull/19623

[SPARK-22162] Executors and the driver should use consistent JobIDs in the RDD commit protocol: https://github.com/apache/spark/pull/19848
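
For anyone who hasn't traced this before, the contract at stake is a two-level one: the driver does job setup and job commit, each task attempt does task setup and task commit, and at most one attempt per task may get its output promoted. Below is a minimal single-threaded model of that sequence; the CommitProtocol trait and the runJob loop are invented for illustration (the real thing is Spark's FileCommitProtocol, with task commits running concurrently on executors and gated by the OutputCommitCoordinator):

object CommitModel {
  // Illustrative only: these names are invented for this sketch, not Spark's API.
  case class TaskCommit(taskId: Int)

  trait CommitProtocol {
    def setupJob(): Unit                           // driver, before any task runs
    def setupTask(taskId: Int): Unit               // executor, per task attempt
    def commitTask(taskId: Int): TaskCommit        // executor; at most one attempt per task
    def abortTask(taskId: Int): Unit               // executor, best-effort cleanup
    def commitJob(commits: Seq[TaskCommit]): Unit  // driver, once, after all tasks succeed
    def abortJob(): Unit                           // driver, best-effort cleanup
  }

  def runJob(protocol: CommitProtocol, taskIds: Seq[Int]): Unit = {
    protocol.setupJob()
    try {
      val commits = taskIds.map { id =>
        protocol.setupTask(id)
        try {
          protocol.commitTask(id)
        } catch {
          case e: Exception =>
            protocol.abortTask(id)  // this attempt's output must never surface
            throw e
        }
      }
      protocol.commitJob(commits)   // only now should the output become visible
    } catch {
      case e: Exception =>
        protocol.abortJob()
        throw e
    }
  }
}

Most of the argument in those PRs is about the edges of this: whether commitTask can fail and be retried, whether commitJob is atomic, and what holds when speculative attempts run concurrently.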

Right now, the Hadoop side of things is non-normatively written up in

with some errata in a WIP patch.


Those docs are incomplete, and I don't know of anything equivalent covering the Spark driver's commit algorithm, so understanding it has mostly been a matter of tracing back through the IDE and running a modified committer set up to do things like fail in task or job commit.
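
As an example of the kind of modified committer I mean, something like the wrapper below is enough to watch how the driver and executors react when a given phase blows up. The class name and the failIn knob are mine, a sketch of the technique rather than the actual test code:

import java.io.IOException

import org.apache.hadoop.mapreduce.{JobContext, JobStatus, OutputCommitter, TaskAttemptContext}

// Delegates to a real committer, throwing in whichever phase you name.
class FaultInjectingCommitter(inner: OutputCommitter, failIn: String)
  extends OutputCommitter {

  private def maybeFail(phase: String): Unit =
    if (phase == failIn) throw new IOException(s"injected failure in $phase")

  override def setupJob(ctx: JobContext): Unit = {
    maybeFail("setupJob"); inner.setupJob(ctx)
  }
  override def commitJob(ctx: JobContext): Unit = {
    maybeFail("commitJob"); inner.commitJob(ctx)
  }
  override def abortJob(ctx: JobContext, state: JobStatus.State): Unit =
    inner.abortJob(ctx, state)  // never inject here: aborts have to succeed

  override def setupTask(ctx: TaskAttemptContext): Unit = {
    maybeFail("setupTask"); inner.setupTask(ctx)
  }
  override def needsTaskCommit(ctx: TaskAttemptContext): Boolean =
    inner.needsTaskCommit(ctx)
  override def commitTask(ctx: TaskAttemptContext): Unit = {
    maybeFail("commitTask"); inner.commitTask(ctx)
  }
  override def abortTask(ctx: TaskAttemptContext): Unit =
    inner.abortTask(ctx)
}

Failing in commitTask versus commitJob exercises very different paths: a failed task commit can be retried on another attempt, while a failed job commit is (usually) terminal.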

Having spent time integrating Hadoop's forthcoming S3A committers with all of this, I suspect there may be some mismatch between what is expected of committers and what they actually deliver, but I'll need to add a bit more fault injection to be sure. I'll have a draft of a paper up in a week or so for anyone interested in this area.

-Steve