[DESIGN] Barrier Execution Mode

Jiang Xingbo
Hi All,

I would like to invite you to review the design document for Barrier Execution Mode:

TL;DR: We announced Project Hydrogen at the recent Spark+AI Summit. A major part of the project involves significant changes to Spark's execution mode. This design doc proposes new APIs and a new execution mode (known as barrier execution mode) to provide high-performance support for DL workloads.

Major changes include:
  • Add RDDBarrier to support gang scheduling.
  • Add BarrierTaskContext to support global sync of all tasks in a stage.
  • Adopt a better fault-tolerance approach for barrier stages: if some tasks fail midway, retry all tasks in the same stage.
  • Integrate barrier execution mode with the Standalone cluster manager.
Please feel free to review and discuss the design proposal.
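
To make the semantics in the list above concrete, here is a toy simulation in plain Python. It is not Spark code and does not use the proposed RDDBarrier or BarrierTaskContext APIs; the names `run_barrier_stage` and `train_step` are hypothetical, and `threading.Barrier` merely stands in for the global sync. It only sketches the three ideas: launch all tasks together, let them synchronize, and retry the whole stage if any task fails.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_barrier_stage(task_fn, num_tasks, max_attempts=3):
    """Toy model of a barrier stage (hypothetical helper, not a Spark API).

    All tasks are launched together; if any task fails, every task in the
    stage is retried, up to max_attempts times.
    """
    for attempt in range(1, max_attempts + 1):
        # One barrier per attempt; it stands in for a stage-wide global sync.
        barrier = threading.Barrier(num_tasks)

        def run_one(task_id):
            try:
                return task_fn(task_id, attempt, barrier)
            except Exception:
                # Release peers blocked on the barrier so the stage can fail fast.
                barrier.abort()
                raise

        # max_workers == num_tasks models gang scheduling: every task gets a
        # slot at the same time, otherwise the barrier could never be reached.
        with ThreadPoolExecutor(max_workers=num_tasks) as pool:
            futures = [pool.submit(run_one, i) for i in range(num_tasks)]

        try:
            return [f.result() for f in futures], attempt
        except Exception:
            continue  # some task failed: retry all tasks in the stage

    raise RuntimeError(f"barrier stage failed after {max_attempts} attempts")


def train_step(task_id, attempt, barrier):
    """Hypothetical task body: task 2 fails on the first attempt only."""
    if attempt == 1 and task_id == 2:
        raise ValueError("simulated task failure")
    barrier.wait(timeout=5)  # global sync point shared by all tasks
    return f"task {task_id} ok on attempt {attempt}"


results, attempts = run_barrier_stage(train_step, num_tasks=4)
print(attempts, results)
```

In this sketch a single failing task aborts the barrier, so its peers fail as well and the whole stage is rerun, which mirrors the "retry all tasks in the same stage" behaviour described above.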

Thanks,
Xingbo

Re: [DESIGN] Barrier Execution Mode

rxin
Xingbo,

Please reference the SPIP and JIRA ticket next time: [SPARK-24374] SPIP: Support Barrier Scheduling in Apache Spark

On Sun, Jul 8, 2018 at 9:45 AM Xingbo Jiang <[hidden email]> wrote:
Hi All,

I would like to invite you to review the design document for Barrier Execution Mode:

TL;DR: We announced Project Hydrogen at the recent Spark+AI Summit. A major part of the project involves significant changes to Spark's execution mode. This design doc proposes new APIs and a new execution mode (known as barrier execution mode) to provide high-performance support for DL workloads.

Major changes include:
  • Add RDDBarrier to support gang scheduling.
  • Add BarrierTaskContext to support global sync of all tasks in a stage.
  • Adopt a better fault-tolerance approach for barrier stages: if some tasks fail midway, retry all tasks in the same stage.
  • Integrate barrier execution mode with the Standalone cluster manager.
Please feel free to review and discuss the design proposal.

Thanks,
Xingbo