internal unit tests failing against the latest spark master

internal unit tests failing against the latest spark master

Koert Kuipers
Hey all,
Today I tried upgrading the Spark version we use internally by creating a new internal release from the Spark master branch. The last time I did this was March 7.

With this updated Spark I am seeing serialization errors in the unit tests for our own libraries. It looks like a non-serializable Scala reflection type is getting pulled into the serialization of the encoder; see the stack trace below.
Best,
Koert

[info]   org.apache.spark.SparkException: Task not serializable
[info]   at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:298)
[info]   at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:288)
[info]   at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:108)
[info]   at org.apache.spark.SparkContext.clean(SparkContext.scala:2284)
[info]   at org.apache.spark.SparkContext.runJob(SparkContext.scala:2058)
...
[info] Serialization stack:
[info]     - object not serializable (class: scala.reflect.internal.BaseTypeSeqs$BaseTypeSeq, value: BTS(Int,AnyVal,Any))
[info]     - field (class: scala.reflect.internal.Types$TypeRef, name: baseTypeSeqCache, type: class scala.reflect.internal.BaseTypeSeqs$BaseTypeSeq)
[info]     - object (class scala.reflect.internal.Types$ClassNoArgsTypeRef, Int)
[info]     - field (class: org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$6, name: elementType$2, type: class scala.reflect.api.Types$TypeApi)
[info]     - object (class org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$6, <function1>)
[info]     - field (class: org.apache.spark.sql.catalyst.expressions.objects.UnresolvedMapObjects, name: function, type: interface scala.Function1)
[info]     - object (class org.apache.spark.sql.catalyst.expressions.objects.UnresolvedMapObjects, unresolvedmapobjects(<function1>, getcolumnbyordinal(0, ArrayType(IntegerType,false)), Some(interface scala.collection.Seq)))
[info]     - field (class: org.apache.spark.sql.catalyst.expressions.objects.WrapOption, name: child, type: class org.apache.spark.sql.catalyst.expressions.Expression)
[info]     - object (class org.apache.spark.sql.catalyst.expressions.objects.WrapOption, wrapoption(unresolvedmapobjects(<function1>, getcolumnbyordinal(0, ArrayType(IntegerType,false)), Some(interface scala.collection.Seq)), ObjectType(interface scala.collection.Seq)))
[info]     - writeObject data (class: scala.collection.immutable.List$SerializationProxy)
[info]     - object (class scala.collection.immutable.List$SerializationProxy, scala.collection.immutable.List$SerializationProxy@69040c85)
[info]     - writeReplace data (class: scala.collection.immutable.List$SerializationProxy)
[info]     - object (class scala.collection.immutable.$colon$colon, List(wrapoption(unresolvedmapobjects(<function1>, getcolumnbyordinal(0, ArrayType(IntegerType,false)), Some(interface scala.collection.Seq)), ObjectType(interface scala.collection.Seq))))
[info]     - field (class: org.apache.spark.sql.catalyst.expressions.objects.NewInstance, name: arguments, type: interface scala.collection.Seq)
[info]     - object (class org.apache.spark.sql.catalyst.expressions.objects.NewInstance, newInstance(class scala.Tuple1))
[info]     - field (class: org.apache.spark.sql.catalyst.encoders.ExpressionEncoder, name: deserializer, type: class org.apache.spark.sql.catalyst.expressions.Expression)
[info]     - object (class org.apache.spark.sql.catalyst.encoders.ExpressionEncoder, class[_1[0]: array<int>])
...
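
For anyone trying to reproduce this outside our test suite: the failure mode in the trace is an encoder being serialized as part of a task closure. A minimal sketch of the kind of job that should hit it (hypothetical, not our actual test code; assumes spark-sql on the classpath and a local session):

import org.apache.spark.sql.{Encoder, SparkSession}

object EncoderClosureRepro {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("repro").getOrCreate()
    import spark.implicits._

    // spark.implicits provides an Encoder[Seq[Int]] (newIntSeqEncoder)
    val enc: Encoder[Seq[Int]] = implicitly[Encoder[Seq[Int]]]

    // Referencing enc inside the task closure forces the ClosureCleaner to
    // serialize it; if the encoder holds a scala-reflect type, this fails
    // with "Task not serializable" as in the stack trace above.
    spark.sparkContext.parallelize(1 to 10).map { i => enc.hashCode; i }.count()

    spark.stop()
  }
}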


Re: internal unit tests failing against the latest spark master

Koert Kuipers
I believe the error is related to an org.apache.spark.sql.expressions.Aggregator where the buffer type (BUF) is Array[Int].
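
To illustrate the shape of the code, here is a hypothetical stand-in for our internal aggregator (not the real one; the only relevant detail is that BUF is Array[Int], and it uses the catalyst-internal ExpressionEncoder since spark.implicits is not in scope inside an object):

import org.apache.spark.sql.Encoder
import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
import org.apache.spark.sql.expressions.Aggregator

// Sums the input ints into a one-element Array[Int] buffer.
object SumToArray extends Aggregator[Int, Array[Int], Seq[Int]] {
  def zero: Array[Int] = Array(0)
  def reduce(b: Array[Int], a: Int): Array[Int] = { b(0) += a; b }
  def merge(b1: Array[Int], b2: Array[Int]): Array[Int] = { b1(0) += b2(0); b1 }
  def finish(b: Array[Int]): Seq[Int] = b.toSeq
  def bufferEncoder: Encoder[Array[Int]] = ExpressionEncoder[Array[Int]]()
  def outputEncoder: Encoder[Seq[Int]] = ExpressionEncoder[Seq[Int]]()
}

Something like ds.select(SumToArray.toColumn) is then where the buffer encoder gets shipped to tasks and serialization blows up.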



Re: internal unit tests failing against the latest spark master

Koert Kuipers
I confirmed that an Encoder[Array[Int]] is no longer serializable, whereas with my Spark build from March 7 it was.

I believe the issue was introduced by commit 295747e59739ee8a697ac3eba485d3439e4a04c3, and I have sent Wenchen an email about it.
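
A direct way to check this, without running a job, is plain Java serialization of the encoder, e.g. in spark-shell (a sketch; note that ExpressionEncoder is a catalyst-internal API):

import java.io.{ByteArrayOutputStream, ObjectOutputStream}
import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder

val enc = ExpressionEncoder[Array[Int]]()
val out = new ObjectOutputStream(new ByteArrayOutputStream())
// On current master this throws java.io.NotSerializableException
// (scala.reflect.internal.BaseTypeSeqs$BaseTypeSeq); on the March 7
// build it succeeds.
out.writeObject(enc)
out.close()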


