Making core Spark run on non-IP network stack

Kai Backman
dev,

I would be interested in understanding how to make core Spark run on a
non-IP network stack like MPI. The main dependencies seem to be
in org.apache.spark.network, but I also see some other dependencies
sprinkled in auxiliary functions.

Pointers to code, mailing list discussions or people to talk to appreciated.

Take care,

  Kai

--
Kai Backman, CEO
http://airstonelabs.com

Re: Making core Spark run on non-IP network stack

Tathagata Das
There are two main uses of the network in Spark (ignoring Spark Streaming):
1. The spark.network code, for bulk data transfer
2. The Akka <http://akka.io/> actor library, for control plane messaging

Besides porting the spark.network code, you will also have to port Akka to run
on your stack. You will find that most of the control layer classes, such as
DAGScheduler and BlockManager, use actors both to communicate and to
process events in a single-threaded fashion. For example, the
BlockManagerMaster (driver) and the BlockManager (worker) exchange
control messages only, using the BlockManagerMasterActor and
BlockManagerSlaveActor, respectively.
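
To make the actor pattern above concrete, here is a minimal in-process sketch. It is written in Python rather than Spark's Scala/Akka and is only an illustration of the idea (a mailbox drained by a single thread), not Spark's or Akka's actual code. The names Actor, BlockRegistry, and the message tuples are hypothetical.

```python
import queue
import threading


class Actor:
    """Minimal actor: messages are queued in a mailbox and handled
    one at a time by a single dedicated thread."""

    _STOP = object()  # sentinel that tells the mailbox loop to exit

    def __init__(self, handler):
        self._handler = handler
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, message):
        # Callers never touch the handler's state directly; they only enqueue.
        self._mailbox.put(message)

    def stop(self):
        self._mailbox.put(self._STOP)
        self._thread.join()

    def _run(self):
        while True:
            message = self._mailbox.get()
            if message is self._STOP:
                break
            self._handler(message)


class BlockRegistry:
    """Hypothetical stand-in for driver-side block bookkeeping."""

    def __init__(self):
        self.locations = {}

    def handle(self, message):
        kind, block_id, worker = message
        if kind == "register":
            self.locations.setdefault(block_id, []).append(worker)
        elif kind == "remove":
            self.locations.pop(block_id, None)


registry = BlockRegistry()
master = Actor(registry.handle)
for i in range(3):
    master.send(("register", "rdd_0_0", f"worker-{i}"))
master.send(("remove", "rdd_0_1", None))
master.stop()
print(registry.locations)
```

Because only the mailbox thread mutates the registry, the handler needs no locks. In a port to another network stack, the part that would change is how send() delivers a message to a remote mailbox; the single-threaded processing stays the same.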

A good place to start would be to read up on Akka in the online
docs <http://akka.io/docs/> and then
look at the code to see how we use it.

TD


On Thu, Jan 2, 2014 at 11:03 PM, Kai Backman <[hidden email]> wrote:

> dev,
>
> I would be interested in understanding how to make core Spark run on a
> non-IP network stack like MPI. The main dependencies seem to be
> in org.apache.spark.network but I also see some other dependencies
> sprinkled in auxiliary functions.
>
> Pointers to code, mailing list discussions or people to talk to
> appreciated.
>
> Take care,
>
>   Kai
>
> --
> Kai Backman, CEO
> http://airstonelabs.com
>

Re: Making core Spark run on non-IP network stack

Matei Zaharia
I think this would be a significant undertaking. My suggestion would be to make the network emulate IP (I believe that option exists for many networks) and then optimize only the individual packages where it makes sense. Much of the communication is just “control plane” traffic (small messages), and only the block manager does more expensive transfers.
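
One way to picture this split is the sketch below, in Python rather than Spark's Scala. It assumes control messages stay on the existing IP path (possibly emulated on the fabric) while only block transfers go through a pluggable optimized path. Every name here (Transport, IpTransport, NativeBulkTransport, Router) is hypothetical and not part of Spark, and the transports only return a description string instead of sending anything.

```python
from abc import ABC, abstractmethod


class Transport(ABC):
    """Hypothetical interface: deliver a payload to a destination."""

    @abstractmethod
    def send(self, dest: str, payload: bytes) -> str:
        ...


class IpTransport(Transport):
    """Stands in for the unchanged IP path (real or emulated)."""

    def send(self, dest: str, payload: bytes) -> str:
        return f"ip:{dest}:{len(payload)}"


class NativeBulkTransport(Transport):
    """Stands in for an optimized path, e.g. one backed by an MPI binding."""

    def send(self, dest: str, payload: bytes) -> str:
        return f"native:{dest}:{len(payload)}"


class Router:
    """Keeps small control messages on IP and routes block data elsewhere."""

    def __init__(self, control: Transport, bulk: Transport):
        self._control = control
        self._bulk = bulk

    def send_control(self, dest: str, payload: bytes) -> str:
        return self._control.send(dest, payload)

    def send_block(self, dest: str, payload: bytes) -> str:
        return self._bulk.send(dest, payload)


router = Router(control=IpTransport(), bulk=NativeBulkTransport())
print(router.send_control("worker-1", b"heartbeat"))
print(router.send_block("worker-1", bytes(1024)))
```

With this shape, the first step is to pass IpTransport for both roles, which corresponds to plain IP emulation. The bulk transport can be swapped in later without touching the control path.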

Matei


Re: Making core Spark run on non-IP network stack

Kai Backman
Thank you for your replies; this is a good start.

Take care,

  Kai

