Question about using collaborative filtering in MLlib

Zak H
Hi,

I'm using the Java API of the DataFrame-based API for Spark MLlib. Should I be using the RDD-based API instead? I'm not sure whether this functionality has been ported over to DataFrames; correct me if I'm wrong.

My goal is to evaluate Spark's recommendation capabilities. I'm looking at this example:



"public RDD<scala.Tuple2<Object,Rating[]>> recommendUsersForProducts(int num)"

For some reason the recommendProductsForUsers method isn't available in the Java API:
model.recommendProductsForUsers

Is there something I'm missing here?

I've posted my code here on this gist. I am using the DataFrame API for MLlib. I know there may be work remaining to port functionality over from the RDD-based API.


Thanks,
Zak

Re: Question about using collaborative filtering in MLlib

Yuhao Yang
Hi Zak,

Indeed, the function is missing from the DataFrame-based API. I can probably put together a quick prototype to see whether we can merge the function into the next release. I'll post updates here; feel free to give it a try.
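
For anyone following along, here is a rough plain-Java sketch of what a "recommend the top N products for every user" call computes conceptually: score every item against each user's latent-factor vector with a dot product, then keep the N highest scores. It is not Spark's implementation and uses no Spark classes. The class name `TopNRecommender`, the `Scored` holder, and the use of plain arrays for the factor matrices are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only: not part of Spark or MLlib.
public class TopNRecommender {

    /** One (item, score) pair; a hypothetical stand-in for a rating record. */
    public static final class Scored {
        public final int item;
        public final double score;

        public Scored(int item, double score) {
            this.item = item;
            this.score = score;
        }
    }

    /** Dot product of two latent-factor vectors of equal length. */
    static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /** For each user row, returns the top `num` items ordered by predicted score. */
    public static List<List<Scored>> recommendForAllUsers(
            double[][] userFactors, double[][] itemFactors, int num) {
        List<List<Scored>> result = new ArrayList<>();
        for (double[] user : userFactors) {
            // Score every item for this user.
            List<Scored> scored = new ArrayList<>();
            for (int item = 0; item < itemFactors.length; item++) {
                scored.add(new Scored(item, dot(user, itemFactors[item])));
            }
            // Highest score first, then keep at most `num` entries.
            scored.sort(Comparator.comparingDouble((Scored s) -> s.score).reversed());
            result.add(new ArrayList<>(scored.subList(0, Math.min(num, scored.size()))));
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up factors: 2 users, 3 items, rank 2.
        double[][] users = {{1.0, 0.0}, {0.0, 1.0}};
        double[][] items = {{0.9, 0.1}, {0.2, 0.8}, {0.5, 0.5}};
        List<List<Scored>> recs = recommendForAllUsers(users, items, 2);
        for (int u = 0; u < recs.size(); u++) {
            for (Scored s : recs.get(u)) {
                System.out.println("user " + u + " -> item " + s.item + " score " + s.score);
            }
        }
    }
}
```

This brute-force version scores every user against every item, which is fine for a small evaluation but would need blocking or partitioning at scale.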

Regards,
Yuhao

In reply to Zak H's message of 2016-11-01 10:00 GMT-07:00.


Re: Question about using collaborative filtering in MLlib

Nick Pentreath
I have a PR for it - https://github.com/apache/spark/pull/12574

Sadly I've been tied up and haven't had a chance to work further on it.

The main outstanding issues are deciding on the transform semantics and performance testing.

Any comments or feedback are welcome, especially on the transform semantics.

N