[STREAMING] Improving the Checkpointing architecture


I've been working on SPARK-23200 and a key point was raised during the
discussion, available at https://github.com/apache/spark/pull/22392

Namely, whether the checkpointing process should allow configuration
values to be changed from those that were originally submitted.

For example, when you restart a streaming job from a checkpoint, even
if you provide different properties for things like `spark.executorEnv`
or a different set of resources for the jobs in the cluster, e.g.,
`spark.kubernetes.executor.limit.cores`, the checkpointing code falls
back to the old values of those properties. This happens even for
properties that were not provided at all on the first submission, which
effectively prevents the user from changing this configuration after
the job has been submitted.
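
To make the behaviour concrete, here is a minimal Python sketch of the
restore step as I understand it. It does not use Spark at all: the
function `restore_conf`, the `reloadable` parameter and the property
values are hypothetical names chosen for illustration, and the real
checkpoint code may differ in detail.

```python
def restore_conf(checkpointed, submitted, reloadable=frozenset()):
    """Model of the restore step (hypothetical, not Spark's actual code).

    Start from the properties saved in the checkpoint and take a value from
    the new submission only for keys listed in `reloadable`.
    """
    conf = dict(checkpointed)
    for key in reloadable:
        if key in submitted:
            conf[key] = submitted[key]
    return conf


# Properties saved in the checkpoint on the first run (illustrative values).
checkpointed = {
    "spark.app.name": "my-stream",
    "spark.kubernetes.executor.limit.cores": "1",
}

# Properties supplied when the job is restarted.
submitted = {
    "spark.app.name": "my-stream",
    "spark.kubernetes.executor.limit.cores": "4",
    "spark.executorEnv.MY_FLAG": "on",
}

restored = restore_conf(checkpointed, submitted)

# The checkpointed value wins over the newly submitted one.
print(restored["spark.kubernetes.executor.limit.cores"])
# A property that was absent on the first run is silently dropped.
print("spark.executorEnv.MY_FLAG" in restored)
```

With an empty `reloadable` set, nothing from the new submission survives
the restore, which is the situation described above.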

The question is whether this is acceptable, or whether we should
provide a mechanism for overriding settings in the checkpointed
configuration.
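
As one possible shape for such a mechanism, here is a rough sketch in
plain Python (no Spark involved). Everything in it is an assumption for
discussion: `restore_with_overrides` and `override_prefixes` are
hypothetical names, not an existing or proposed Spark API.

```python
def restore_with_overrides(checkpointed, submitted, override_prefixes):
    """Hypothetical override mechanism, for discussion only.

    Keep the checkpointed properties, but let the new submission win for
    any key that starts with one of the allowed prefixes.
    """
    conf = dict(checkpointed)
    for key, value in submitted.items():
        if any(key.startswith(prefix) for prefix in override_prefixes):
            conf[key] = value
    return conf


conf = restore_with_overrides(
    checkpointed={
        "spark.app.name": "my-stream",
        "spark.kubernetes.executor.limit.cores": "1",
    },
    submitted={
        "spark.app.name": "renamed",
        "spark.kubernetes.executor.limit.cores": "4",
        "spark.executorEnv.MY_FLAG": "on",
    },
    # Only these families of properties may change across restarts.
    override_prefixes=("spark.executorEnv.", "spark.kubernetes.executor."),
)

for key in sorted(conf):
    print(key, "=", conf[key])
```

Here the executor properties take their new values while
`spark.app.name` keeps its checkpointed value, so the user decides
explicitly which settings are allowed to change on restart.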

