Python to Java object conversion of numpy array


Python to Java object conversion of numpy array

Meethu Mathew-2
Hi,
I am trying to send a numpy array as an argument to a function predict()
in a class in spark/python/pyspark/mllib/clustering.py, which is passed
to the function callMLlibFunc(name, *args) in
spark/python/pyspark/mllib/common.py.

The value is then passed to the function _py2java(sc, obj), where I am
getting an exception:

Py4JJavaError: An error occurred while calling z:org.apache.spark.mllib.api.python.SerDe.loads.
: net.razorvine.pickle.PickleException: expected zero arguments for construction of ClassDict (for numpy.core.multiarray._reconstruct)
        at net.razorvine.pickle.objects.ClassDictConstructor.construct(ClassDictConstructor.java:23)
        at net.razorvine.pickle.Unpickler.load_reduce(Unpickler.java:617)
        at net.razorvine.pickle.Unpickler.dispatch(Unpickler.java:170)
        at net.razorvine.pickle.Unpickler.load(Unpickler.java:84)
        at net.razorvine.pickle.Unpickler.loads(Unpickler.java:97)


Why is common._py2java(sc, obj) not handling the numpy array type?

Please help.


--

Regards,

*Meethu Mathew*

*Engineer*

*Flytxt*

www.flytxt.com | Visit our blog <http://blog.flytxt.com/> | Follow us
<http://www.twitter.com/flytxt> | _Connect on Linkedin
<http://www.linkedin.com/home?trk=hb_tab_home_top>_


Re: Python to Java object conversion of numpy array

Davies Liu
Hey Meethu,

The Java API accepts only Vector, so you should convert the numpy array
into a pyspark.mllib.linalg.DenseVector.

BTW, which class are you using? KMeansModel.predict() accepts a
numpy.array; it will do the conversion for you.

Davies
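[Editorially, the failure above can be seen without Spark at all. A minimal sketch, assuming only numpy and the standard pickle module: an ndarray's pickle stream references numpy's _reconstruct helper, which the JVM-side unpickler (net.razorvine.pickle, used by SerDe.loads) has no constructor for, while plain Python floats, or a DenseVector that PySpark registers a constructor for, avoid the custom reducer.]

```python
import pickle

import numpy as np

arr = np.array([0.8786, -0.7855])

# An ndarray pickles via numpy's multiarray._reconstruct; the JVM-side
# unpickler cannot construct it, hence "expected zero arguments for
# construction of ClassDict (for numpy.core.multiarray._reconstruct)".
payload = pickle.dumps(arr, protocol=2)
print(b"_reconstruct" in payload)  # True: the custom reducer is in the stream

# Converting to plain Python floats first sidesteps the custom reducer:
plain_payload = pickle.dumps(arr.tolist(), protocol=2)
print(b"_reconstruct" in plain_payload)  # False
```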

On Fri, Jan 9, 2015 at 4:45 AM, Meethu Mathew <[hidden email]> wrote:

> [...]

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: Python to Java object conversion of numpy array

Meethu Mathew-2
Hi,
Thanks Davies.

I added a new class GaussianMixtureModel in clustering.py with a
predict method, and I was trying to pass a numpy array from that
method. I converted it to a DenseVector and that case is solved now.

Similarly, I tried passing a list of more than one dimension to the
function _py2java, but now the exception is

'list' object has no attribute '_get_object_id'

and when I give a tuple input (Vectors.dense([0.8786, -0.7855]),
Vectors.dense([-0.1863, 0.7799])), the exception is

'numpy.ndarray' object has no attribute '_get_object_id'

Regards,
Meethu

On Friday 09 January 2015 11:37 PM, Davies Liu wrote:

> [...]


Re: Python to Java object conversion of numpy array

Davies Liu
Could you post a piece of code here?

On Sun, Jan 11, 2015 at 9:28 PM, Meethu Mathew <[hidden email]> wrote:

> [...]



Re: Python to Java object conversion of numpy array

Meethu Mathew-2
Hi,

This is the code I am running:

mu = (Vectors.dense([0.8786, -0.7855]), Vectors.dense([-0.1863, 0.7799]))

membershipMatrix = callMLlibFunc("findPredict",
    rdd.map(_convert_to_vector), mu)

Regards,
Meethu
On Monday 12 January 2015 11:46 AM, Davies Liu wrote:

> [...]


Re: Python to Java object conversion of numpy array

Davies Liu
On Sun, Jan 11, 2015 at 10:21 PM, Meethu Mathew
<[hidden email]> wrote:
> Hi,
>
> This is the code I am running.
>
> mu = (Vectors.dense([0.8786, -0.7855]),Vectors.dense([-0.1863, 0.7799]))
>
> membershipMatrix = callMLlibFunc("findPredict", rdd.map(_convert_to_vector),
> mu)

What does the Java API look like? All the arguments of findPredict
should be converted into Java objects, so what should `mu` be converted
to?

> [...]



Re: Python to Java object conversion of numpy array

Meethu Mathew-2
Hi,

This is the function defined in PythonMLLibAPI.scala:

def findPredict(
      data: JavaRDD[Vector],
      wt: Object,
      mu: Array[Object],
      si: Array[Object]): RDD[Array[Double]] = {
}

So the parameter mu should be converted to Array[Object].

mu = (Vectors.dense([0.8786, -0.7855]), Vectors.dense([-0.1863, 0.7799]))

def _py2java(sc, obj):
    if isinstance(obj, RDD):
        ...
    elif isinstance(obj, SparkContext):
        ...
    elif isinstance(obj, dict):
        ...
    elif isinstance(obj, (list, tuple)):
        obj = ListConverter().convert(obj, sc._gateway._gateway_client)
    elif isinstance(obj, JavaObject):
        pass
    elif isinstance(obj, (int, long, float, bool, basestring)):
        pass
    else:
        bytes = bytearray(PickleSerializer().dumps(obj))
        obj = sc._jvm.SerDe.loads(bytes)
    return obj

Since mu is a tuple of DenseVectors, _py2java() enters the
isinstance(obj, (list, tuple)) branch, and the conversion throws an
exception because the tuple's elements are not JavaObjects (they have
no _get_object_id). However, the conversion works correctly if the
pickle path is taken instead (the final else branch).

Hope it's clear now.

Regards,
Meethu
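[Editorially, the AttributeError in the (list, tuple) branch can be reproduced without Spark. In the sketch below, JavaObject and convert_list are simplified hypothetical stand-ins for py4j's JavaObject and ListConverter, not the real API: the converter asks each element for its JVM object id, which plain Python objects and numpy arrays do not have.]

```python
class JavaObject:
    """Hypothetical stand-in: a real py4j JavaObject carries a JVM-side id."""

    def __init__(self, object_id):
        self._object_id = object_id

    def _get_object_id(self):
        return self._object_id


def convert_list(elements):
    # Like py4j's list conversion, fetch each element's JVM object id.
    # Anything that is not already a JavaObject raises AttributeError.
    return [e._get_object_id() for e in elements]


# A tuple of JavaObjects converts fine:
print(convert_list((JavaObject("o1"), JavaObject("o2"))))  # ['o1', 'o2']

# A tuple of plain Python lists reproduces the reported error:
try:
    convert_list(([0.8786, -0.7855], [-0.1863, 0.7799]))
except AttributeError as e:
    print(e)  # 'list' object has no attribute '_get_object_id'
```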

On Monday 12 January 2015 11:35 PM, Davies Liu wrote:

> On Sun, Jan 11, 2015 at 10:21 PM, Meethu Mathew
> <[hidden email]> wrote:
>> Hi,
>>
>> This is the code I am running.
>>
>> mu = (Vectors.dense([0.8786, -0.7855]),Vectors.dense([-0.1863, 0.7799]))
>>
>> membershipMatrix = callMLlibFunc("findPredict", rdd.map(_convert_to_vector),
>> mu)
> What's the Java API looks like? all the arguments of findPredict
> should be converted
> into java objects, so what should `mu` be converted to?
>
>> Regards,
>> Meethu
>> On Monday 12 January 2015 11:46 AM, Davies Liu wrote:
>>
>> Could you post a piece of code here?
>>
>> On Sun, Jan 11, 2015 at 9:28 PM, Meethu Mathew <[hidden email]>
>> wrote:
>>
>> Hi,
>> Thanks Davies .
>>
>> I added a new class GaussianMixtureModel in clustering.py and the method
>> predict in it and trying to pass numpy array from this method.I converted it
>> to DenseVector and its solved now.
>>
>> Similarly I tried passing a List  of more than one dimension to the function
>> _py2java , but now the exception is
>>
>> 'list' object has no attribute '_get_object_id'
>>
>> and when I give a tuple input (Vectors.dense([0.8786,
>> -0.7855]),Vectors.dense([-0.1863, 0.7799])) exception is like
>>
>> 'numpy.ndarray' object has no attribute '_get_object_id'
>>
>> Regards,
>>
>>
>>
>> Meethu Mathew
>>
>> Engineer
>>
>> Flytxt
>>
>> www.flytxt.com | Visit our blog  |  Follow us | Connect on Linkedin
>>
>>
>>
>> On Friday 09 January 2015 11:37 PM, Davies Liu wrote:
>>
>> Hey Meethu,
>>
>> The Java API accepts only Vector, so you should convert the numpy array into
>> pyspark.mllib.linalg.DenseVector.
>>
>> BTW, which class are you using? the KMeansModel.predict() accept
>> numpy.array,
>> it will do the conversion for you.
>>
>> Davies
>>
>> On Fri, Jan 9, 2015 at 4:45 AM, Meethu Mathew <[hidden email]>
>> wrote:
>>
>> Hi,
>> I am trying to send a numpy array as an argument to a function predict() in
>> a class in spark/python/pyspark/mllib/clustering.py which is passed to the
>> function callMLlibFunc(name, *args)  in
>> spark/python/pyspark/mllib/common.py.
>>
>> Now the value is passed to the function  _py2java(sc, obj) .Here I am
>> getting an exception
>>
>> Py4JJavaError: An error occurred while calling
>> z:org.apache.spark.mllib.api.python.SerDe.loads.
>> : net.razorvine.pickle.PickleException: expected zero arguments for
>> construction of ClassDict (for numpy.core.multiarray._reconstruct)
>>          at
>> net.razorvine.pickle.objects.ClassDictConstructor.construct(ClassDictConstructor.java:23)
>>          at net.razorvine.pickle.Unpickler.load_reduce(Unpickler.java:617)
>>          at net.razorvine.pickle.Unpickler.dispatch(Unpickler.java:170)
>>          at net.razorvine.pickle.Unpickler.load(Unpickler.java:84)
>>          at net.razorvine.pickle.Unpickler.loads(Unpickler.java:97)
>>
>>
>> Why common._py2java(sc, obj) is not handling numpy array type?
>>
>> Please help..
>>
>>
>> --
>>
>> Regards,
>>
>> *Meethu Mathew*
>>
>> *Engineer*
>>
>> *Flytxt*
>>
>> www.flytxt.com | Visit our blog <http://blog.flytxt.com/> | Follow us
>> <http://www.twitter.com/flytxt> | _Connect on Linkedin
>> <http://www.linkedin.com/home?trk=hb_tab_home_top>_
>>
>>
>>


Re: Python to Java object conversion of numpy array

Davies Liu
On Mon, Jan 12, 2015 at 8:14 PM, Meethu Mathew <[hidden email]> wrote:

> [...]

I see, we should remove the special case for list and tuple; pickle
should work more reliably for them. I tried removing it, and it did not
break any tests.

Could you do it in your PR, or should I create a separate PR for it?
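[Editorially, with the list/tuple special case removed, tuples would fall through to the pickle branch. A rough sketch of that fallback, using pickle.loads as a stand-in for the JVM-side SerDe.loads; this is an illustration, not the actual PySpark code:]

```python
import pickle


def py2java_pickle_path(obj, jvm_loads=pickle.loads):
    # Primitives pass through unchanged, as in _py2java.
    if isinstance(obj, (int, float, bool, str)):
        return obj
    # Everything else, including tuples of vectors, goes through pickle;
    # on the JVM side, SerDe.loads has registered constructors for MLlib
    # types, so the bytes deserialize into proper Java objects.
    payload = bytearray(pickle.dumps(obj, protocol=2))
    return jvm_loads(bytes(payload))


# Plain lists stand in for DenseVectors here:
mu = ([0.8786, -0.7855], [-0.1863, 0.7799])
print(py2java_pickle_path(mu))  # ([0.8786, -0.7855], [-0.1863, 0.7799])
```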

