is there any tool to visualize the spark physical plan or spark plan

is there any tool to visualize the spark physical plan or spark plan

zhangliyun
Hi all,
  I want to ask: is there any tool to visualize the Spark physical plan or Spark plan? Sometimes the physical plan is very long, so it is difficult to read.

Best Regards
KellyZhang


 


Re: is there any tool to visualize the spark physical plan or spark plan

Manu Zhang
Hi Kelly,

If you can parse the event log, try listening for the `SparkListenerSQLExecutionStart` event and building a `SparkPlanGraph` from it, as in https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLAppStatusListener.scala#L306.

`SparkPlanGraph` has a `makeDotFile` method that you can use to write out a `.dot` file and visualize it with Graphviz tools, e.g. http://www.webgraphviz.com/
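
To make the `.dot` idea concrete, here is a minimal Scala sketch that renders a plan tree as a Graphviz digraph. It deliberately does not use Spark's `SparkPlanGraph`: `PlanNode`, `PlanDot`, and `toDot` are hypothetical names invented for this illustration, and it only shows the general shape of the output.

```scala
// Hypothetical stand-in for a physical plan node; this is NOT a Spark class.
final case class PlanNode(id: Int, name: String, children: Seq[PlanNode] = Nil)

object PlanDot {
  // Escape backslashes and double quotes so names are valid DOT string labels.
  private def escape(s: String): String =
    s.replace("\\", "\\\\").replace("\"", "\\\"")

  // Collect every node in the tree, parent before its children.
  private def flatten(node: PlanNode): Seq[PlanNode] =
    node +: node.children.flatMap(flatten)

  // Render the tree as a Graphviz digraph. In this sketch edges point from
  // child to parent, i.e. in the direction the data flows.
  def toDot(root: PlanNode): String = {
    val nodes = flatten(root)
    val nodeLines = nodes.map(n => s"""  ${n.id} [label="${escape(n.name)}"];""")
    val edgeLines = for (p <- nodes; c <- p.children) yield s"  ${c.id} -> ${p.id};"
    (Seq("digraph plan {") ++ nodeLines ++ edgeLines ++ Seq("}")).mkString("\n")
  }
}
```

If the returned string is saved to a file such as `plan.dot`, a local Graphviz install can render it with `dot -Tsvg plan.dot -o plan.svg`, or the text can be pasted into an online viewer like the one linked above.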

Thanks,
Manu

On Thu, Apr 30, 2020 at 3:21 PM zhangliyun <[hidden email]> wrote:
Hi all,
  I want to ask: is there any tool to visualize the Spark physical plan or Spark plan? Sometimes the physical plan is very long, so it is difficult to read.

Best Regards
KellyZhang


 


Re: is there any tool to visualize the spark physical plan or spark plan

cloud0fan

On Thu, Apr 30, 2020 at 5:30 PM Manu Zhang <[hidden email]> wrote:
Hi Kelly,

If you can parse the event log, try listening for the `SparkListenerSQLExecutionStart` event and building a `SparkPlanGraph` from it, as in https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLAppStatusListener.scala#L306.

`SparkPlanGraph` has a `makeDotFile` method that you can use to write out a `.dot` file and visualize it with Graphviz tools, e.g. http://www.webgraphviz.com/

Thanks,
Manu


Re:Re: is there any tool to visualize the spark physical plan or spark plan

zhangliyun
In reply to this post by Manu Zhang


Thanks a lot for your suggestion.





Re:Re: is there any tool to visualize the spark physical plan or spark plan

zhangliyun
In reply to this post by cloud0fan



Hi Wenchen Fan,
 Thanks for the reply. In the link, I saw the SQL metrics, which are very useful.
```
SQL metrics

The metrics of SQL operators are shown in the block of physical operators. The SQL metrics can be useful when we want to dive into the execution details of each operator. For example, “number of output rows” can answer how many rows are output after a Filter operator, “shuffle bytes written total” in an Exchange operator shows the number of bytes written by a shuffle.

Here is the list of SQL metrics:

```

  My question is: apart from reading these metrics in the Spark web UI, is there any way to read them on the driver side in code?


Best regards 

Kelly Zhang



At 2020-04-30 21:38:56, "Wenchen Fan" <[hidden email]> wrote:



Re: is there any tool to visualize the spark physical plan or spark plan

Enrico Minack
Kelly Zhang,

You can add a `SparkListener` to your Spark context:

   sparkContext.addSparkListener(new SparkListener { })

That listener can override `onTaskEnd`, which receives a `SparkListenerTaskEnd` for each task. That instance gives you access to the task metrics.

See:

- https://spark.apache.org/docs/latest/api/java/org/apache/spark/scheduler/SparkListener.html#onTaskEnd-org.apache.spark.scheduler.SparkListenerTaskEnd-
- https://spark.apache.org/docs/latest/api/java/org/apache/spark/scheduler/SparkListenerTaskEnd.html#taskMetrics--
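
A minimal sketch of such a listener could look like the following. The class name `TaskMetricsListener` and the choice of counters are assumptions made for illustration. Note that these are task-level metrics taken from `taskMetrics`, not the per-operator SQL metrics shown in the web UI.

```scala
import java.util.concurrent.atomic.AtomicLong
import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}

// Hypothetical listener that accumulates a few task-level metrics on the driver.
class TaskMetricsListener extends SparkListener {
  val tasksSeen = new AtomicLong(0)
  val recordsRead = new AtomicLong(0)
  val shuffleBytesWritten = new AtomicLong(0)

  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
    // taskMetrics can be null (for example for some failed tasks), so guard it.
    val metrics = taskEnd.taskMetrics
    if (metrics != null) {
      tasksSeen.incrementAndGet()
      recordsRead.addAndGet(metrics.inputMetrics.recordsRead)
      shuffleBytesWritten.addAndGet(metrics.shuffleWriteMetrics.bytesWritten)
    }
  }
}
```

Assuming an existing `SparkSession` named `spark`, it would be registered with `spark.sparkContext.addSparkListener(new TaskMetricsListener)` before running the query. Listener events are delivered asynchronously, so the counters are best read after the job has finished.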

Regards,
Enrico


On 02.05.20 at 00:45, zhangliyun wrote:



Hi Wenchen Fan,
 Thanks for the reply. In the link, I saw the SQL metrics, which are very useful.
```
SQL metrics

The metrics of SQL operators are shown in the block of physical operators. The SQL metrics can be useful when we want to dive into the execution details of each operator. For example, “number of output rows” can answer how many rows are output after a Filter operator, “shuffle bytes written total” in an Exchange operator shows the number of bytes written by a shuffle.

Here is the list of SQL metrics:

```

  My question is: apart from reading these metrics in the Spark web UI, is there any way to read them on the driver side in code?


Best regards 

Kelly Zhang


