Spark3.0 gpu support

Spark3.0 gpu support

cfangmac

Hi everyone,

I recently built the master branch of Apache Spark from GitHub to try out GPU-aware scheduling.

I set up a standalone cluster and set several GPU-related configuration options, such as:

A) spark.worker.resourceFile, which points to a JSON file that contains the GPU addresses;

B) spark.worker.resource.gpu.amount, which specifies the number of GPUs for each worker;

C) spark.executor.resource.gpu.amount, which specifies the number of GPUs for each executor;

D) spark.task.resource.gpu.amount, which specifies the number of GPUs requested by each task.
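As a rough illustration of the four options above, the sketch below builds the JSON content for the worker resources file and the matching property lines. It rests on assumptions: the file schema (a list of allocations with `componentName` and `resourceName`) and the key spelling `spark.worker.resourcesFile` follow my reading of the standalone-mode documentation and should be checked against the build in use. The helper names, the path, and the GPU counts are hypothetical.

```python
import json


def worker_resources_file(addresses):
    """Return JSON text for the worker resources file.

    Assumed schema: a list of allocations, each naming the component,
    the resource, and the addresses it owns.
    """
    allocation = {
        "id": {"componentName": "spark.worker", "resourceName": "gpu"},
        "addresses": list(addresses),
    }
    return json.dumps([allocation])


def gpu_conf(resources_path, per_worker, per_executor, per_task):
    """Return the four GPU-related options as a dict of strings."""
    return {
        # Assumed spelling; the post writes it as spark.worker.resourceFile.
        "spark.worker.resourcesFile": resources_path,
        "spark.worker.resource.gpu.amount": str(per_worker),
        "spark.executor.resource.gpu.amount": str(per_executor),
        "spark.task.resource.gpu.amount": str(per_task),
    }


def as_properties(conf):
    """Render the options in spark-defaults.conf style, one per line."""
    return "\n".join(f"{key} {value}" for key, value in conf.items())


if __name__ == "__main__":
    # Hypothetical cluster: two GPUs per worker, one per executor and task.
    print(worker_resources_file(["0", "1"]))
    print(as_properties(gpu_conf("/path/to/gpus.json", 2, 1, 1)))
```

These settings only tell the scheduler which GPU addresses exist and how to hand them out; they do not make any computation run on a GPU.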

 

I then ran a k-means training program, which I expected to involve many mathematical operations that a GPU could accelerate. The web UI indicated that the GPU options were configured correctly. However, a GPU monitoring tool showed that the GPUs were not being used; in other words, the training program still ran on the CPU rather than the GPU.

Now I am confused about two points:

1. Is there something I missed that prevented the GPUs from being used?

2. After a task is deserialized in the executor, how does a JVM (Java/Scala) program run on the GPU? Does the Spark executor use JNI with CUDA/OpenCL, or some other tool?


Thanks

Chao Fang

 



---------------------------------------------------------------------
To unsubscribe e-mail: [hidden email]

FW: Spark3.0 gpu support

cfangmac

From: cfangmac <[hidden email]>
Date: Monday, December 9, 2019, 3:49 PM
To: <[hidden email]>
Subject: Spark3.0 gpu support

 


Re: Spark3.0 gpu support

Sean Owen-2
In reply to this post by cfangmac
Spark itself does not use GPUs at all. This functionality is for scheduling workloads that do.
Spark does use BLAS, but unless your BLAS library runs on GPUs, that would not cause Spark to use them.
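
To make the division of labor concrete, here is a minimal PySpark sketch of what user code might do inside a task: ask Spark which GPU addresses the scheduler assigned, then hand them to a GPU library. `TaskContext.resources()` is the Spark 3.0 API for this; the function names are hypothetical, and setting `CUDA_VISIBLE_DEVICES` is only one possible way to pass the assignment on. It assumes the GPU library has not yet initialized CUDA in the worker process.

```python
import os


def visible_devices(addresses):
    """Join GPU addresses into the comma-separated CUDA_VISIBLE_DEVICES form."""
    return ",".join(addresses)


def process_partition(rows):
    """Run inside a task; intended for use with rdd.mapPartitions."""
    # Imported lazily so this module also loads where PySpark is absent.
    from pyspark import TaskContext

    # Addresses the scheduler assigned to this task, e.g. ["0"].
    gpus = TaskContext.get().resources()["gpu"].addresses
    os.environ["CUDA_VISIBLE_DEVICES"] = visible_devices(gpus)

    # Hypothetical step: pass `rows` to a GPU library of your choice here.
    # Spark only did the bookkeeping; the GPU work is up to this code.
    yield from rows
```

A job would call something like `rdd.mapPartitions(process_partition)`. Built-in algorithms such as MLlib's k-means do not contain a step like this, which is consistent with the GPUs staying idle in the original report.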

On Mon, Dec 9, 2019 at 1:50 AM cfangmac <[hidden email]> wrote:
