Pyspark Selecting columns from Dataframe by providing schema as StructType


alokt
Hi

In Scala, selecting columns from a DataFrame by providing a schema can be done like this:
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

object UserSchema {
  val Name = "name"
  val NameField = StructField(Name, StringType, nullable = true)

  val Age = "age"
  val AgeField = StructField(Age, IntegerType, nullable = true)

  val Address = "address"
  val AddressField = StructField(Address, StringType, nullable = true)
}

import UserSchema._

// The schema is built from the StructFields, not the raw name strings.
val TestSchema = StructType(List(NameField, AgeField, AddressField))

// Map each field name in the schema to a Column and select them all.
val TestColumns = TestSchema.fieldNames.toList.map(col)

val outDF = inDF.select(TestColumns: _*)

How can the same be achieved in pyspark?
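
A direct translation might look like the following (a minimal sketch; in_df stands in for the Scala inDF and is assumed to be an existing DataFrame with these columns):

from pyspark.sql.functions import col
from pyspark.sql.types import IntegerType, StringType, StructField, StructType

# Mirror of the Scala UserSchema object.
NAME = "name"
NAME_FIELD = StructField(NAME, StringType(), nullable=True)

AGE = "age"
AGE_FIELD = StructField(AGE, IntegerType(), nullable=True)

ADDRESS = "address"
ADDRESS_FIELD = StructField(ADDRESS, StringType(), nullable=True)

# StructType takes a list of StructFields, as in Scala.
test_schema = StructType([NAME_FIELD, AGE_FIELD, ADDRESS_FIELD])

# fieldNames() returns the schema's column names; map each to a Column.
test_columns = [col(name) for name in test_schema.fieldNames()]

# *test_columns unpacks the list, the Python analogue of Scala's `: _*`.
out_df = in_df.select(*test_columns)

Since select also accepts plain strings, in_df.select(*test_schema.fieldNames()) should work as a shorter variant.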

Thank you.