Aggregated stats on data locality

Lars Francke

I'm looking for a way to get an aggregated view of data locality across a Spark job or stage.

I was sure that this existed, but I can't find it now.
Basically, I want a quick summary of how many tasks were PROCESS_LOCAL vs. NODE_LOCAL, etc.

Is there a way to do this, or a Jira that tracks it? (I couldn't find one, but I might have used the wrong search terms.)

Thank you!