How about the fetch the shuffle data in one same machine?

raintung li
Hi all,

Currently Spark only reads a shuffle file locally when the executorId is the same. When the IP is the same but the executorId is different, it fetches the data over the network, even though the data could be fetched locally or over loopback.

Fetching the local file is clearly faster, since it can use the LVS cache.
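To make the difference concrete, here is a minimal sketch of the two rules in Python. It is not Spark's actual code: `BlockLocation`, `split_blocks`, the `treat_same_host_as_local` flag, and the block ids and addresses are all hypothetical names chosen for illustration. It also assumes that an executor would be able to read the shuffle files written by another executor on the same machine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockLocation:
    """Hypothetical stand-in for where a shuffle block lives."""
    executor_id: str
    host: str


def split_blocks(me, block_locations, treat_same_host_as_local=False):
    """Partition block ids into (local, remote) lists.

    Default rule: a block is local only if it belongs to the same executor.
    Proposed rule (treat_same_host_as_local=True): a block is also local
    if it lives on the same host, even under a different executor.
    """
    local, remote = [], []
    for block_id, loc in block_locations.items():
        same_executor = loc.executor_id == me.executor_id
        same_host = loc.host == me.host
        if same_executor or (treat_same_host_as_local and same_host):
            local.append(block_id)
        else:
            remote.append(block_id)
    return local, remote


# Illustrative data only: three blocks, two of them on this machine.
me = BlockLocation("exec-1", "10.0.0.5")
blocks = {
    "shuffle_0_0_0": BlockLocation("exec-1", "10.0.0.5"),
    "shuffle_0_1_0": BlockLocation("exec-2", "10.0.0.5"),
    "shuffle_0_2_0": BlockLocation("exec-3", "10.0.0.6"),
}

current = split_blocks(me, blocks)
proposed = split_blocks(me, blocks, treat_same_host_as_local=True)
print("current rule: ", current)
print("proposed rule:", proposed)
```

Under the current rule the block owned by `exec-2` is fetched over the network although it sits on the same machine. Under the proposed rule it would be read as a local file.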

What do you think?

Regards
-Raintung 

Re: How about the fetch the shuffle data in one same machine?

Saisai Shao
There is a JIRA about this (https://issues.apache.org/jira/browse/SPARK-6521). Currently, Spark's shuffle fetch still goes through Netty even when the two executors are on the same node, but according to the test on the JIRA, performance is about the same whether or not the network is bypassed. From my understanding, the kernel will not send data to the NIC if it is just loopback communication (please correct me if I'm wrong).

(In reply to raintung li's message of Wed, May 10, 2017 at 5:53 PM.)


Re: How about the fetch the shuffle data in one same machine?

raintung li
I don't think this goes through loopback; only traffic to localhost or 127.0.0.1 does.
Also, the benchmark code should be simple and should not involve any computation.
Just write two tests:
one that only reads the file locally
the other that only reads the file through Netty

Reading files of different sizes (small -> big) should give different benchmark results. Of course, a memory copy is faster than network delivery.
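
A rough sketch of such a benchmark in Python is below. It is only an illustration under stated assumptions: a plain TCP socket bound to 127.0.0.1 stands in for Netty, so the numbers say nothing definitive about Spark itself, and the names `read_local`, `read_over_socket`, and `run_benchmark` as well as the file sizes are made up for this example. The socket timing also includes connection setup and starting the server thread.

```python
import os
import socket
import tempfile
import threading
import time

CHUNK = 64 * 1024  # read/send buffer size in bytes (arbitrary choice)


def read_local(path):
    """Read the whole file from disk and return the number of bytes read."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(CHUNK)
            if not chunk:
                break
            total += len(chunk)
    return total


def _serve_file_once(server_sock, path):
    """Accept one connection, stream the file to it, then close it."""
    conn, _ = server_sock.accept()
    with conn, open(path, "rb") as f:
        while True:
            chunk = f.read(CHUNK)
            if not chunk:
                break
            conn.sendall(chunk)


def read_over_socket(path):
    """Read the same file through a TCP connection on 127.0.0.1."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    sender = threading.Thread(target=_serve_file_once, args=(server, path))
    sender.start()
    total = 0
    with socket.create_connection(("127.0.0.1", port)) as client:
        while True:
            chunk = client.recv(CHUNK)
            if not chunk:  # sender closed the connection: end of file
                break
            total += len(chunk)
    sender.join()
    server.close()
    return total


def _timed(fn, path):
    start = time.perf_counter()
    nbytes = fn(path)
    return nbytes, time.perf_counter() - start


def run_benchmark(sizes):
    """Return (size, local_bytes, socket_bytes, local_secs, socket_secs) per size."""
    results = []
    for size in sizes:
        with tempfile.NamedTemporaryFile(delete=False) as tmp:
            tmp.write(os.urandom(size))
            path = tmp.name
        try:
            n_local, t_local = _timed(read_local, path)
            n_sock, t_sock = _timed(read_over_socket, path)
            results.append((size, n_local, n_sock, t_local, t_sock))
        finally:
            os.remove(path)
    return results


if __name__ == "__main__":
    # Small -> big, as suggested above; the sizes are illustrative.
    for size, _, _, t_local, t_sock in run_benchmark(
        [4 * 1024, 1024 * 1024, 16 * 1024 * 1024]
    ):
        print(f"{size:>10} bytes  local={t_local:.6f}s  socket={t_sock:.6f}s")
```

To compare loopback against the machine's own non-loopback address, the `127.0.0.1` in `read_over_socket` could be replaced with that address. A single run per size is noisy, so repeating each measurement and comparing medians would give more reliable figures.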

(In reply to Saisai Shao's message of Wed, May 10, 2017 at 6:14 PM.)