Spark-Locality: Hinting Spark location of the executor does not take effect

Priyanka Gomatam

Sending on behalf of a colleague whose mail isn’t reaching the dev list for some reason 😊

 

=======================================================================================================================================================

 

Hi Spark developers,

 

I want to hint Spark to use a particular list of hosts to execute tasks on. I see that getBlockLocations is used to get the list of hosts from HDFS.

 

https://github.com/apache/spark/blob/7955b3962ac46b89564e0613db7bea98a1478bf2/sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala#L386

 

 

Hinting Spark through a custom getBlockLocations that returns an Array of BlockLocation objects carrying the host IP addresses doesn’t help; Spark continues to schedule the tasks on other executors’ hosts.
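
For concreteness, here is a minimal sketch of the kind of override described above. It assumes the hint is supplied through Hadoop's FileSystem.getFileBlockLocations; the post does not say which method was actually overridden. The class name HintedFileSystem, the PreferredHosts list, and the port 9866 are hypothetical placeholders, not taken from the actual setup.

```scala
import org.apache.hadoop.fs.{BlockLocation, FileStatus, LocalFileSystem}

object HintedFileSystem {
  // Placeholder IP addresses, assumed for illustration only.
  val PreferredHosts: Array[String] = Array("10.0.0.1", "10.0.0.2")
}

// Hypothetical file system that reports the same preferred hosts for every file.
class HintedFileSystem extends LocalFileSystem {

  override def getFileBlockLocations(
      file: FileStatus,
      start: Long,
      len: Long): Array[BlockLocation] = {
    if (file == null || len <= 0 || start >= file.getLen) {
      // Nothing to report for an empty or out-of-range request.
      Array.empty[BlockLocation]
    } else {
      val hosts = HintedFileSystem.PreferredHosts
      // "names" are host:port strings; the port here is an assumed placeholder.
      val names = hosts.map(h => s"$h:9866")
      // One block spanning the whole file, attributed to the hinted hosts.
      Array(new BlockLocation(names, hosts, 0L, file.getLen))
    }
  }
}
```

A class like this would typically be registered through Hadoop's fs.<scheme>.impl configuration key. Whether Spark's file listing consults this method for the file system in use is an assumption here.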

 

Is there something I am doing wrong?

 

Test:

spark.read.csv()

 

 

Appreciate your inputs 😊

 

Thanks,

Nasrulla