Divide Spark Dataframe to parts by timestamp

Chetan Khatri
Hello All,

I have a Spark DataFrame whose timestamps range from 2015-10-07 19:36:59 to 2017-01-01 18:53:23.

I want to split this DataFrame into 3 parts and wrote the code below to do it. Can anyone please confirm whether this is the correct approach?

val finalDF1 = sampleDF.where(sampleDF.col("timestamp_col").gt("2017-01-01 23:59:59"))
val finalDF2 = sampleDF.where(sampleDF.col("timestamp_col").lt("2017-01-02 00:00:00") and sampleDF.col("timestamp_col").gt("2016-06-31 23:59:59"))
val finalDF3 = sampleDF.where(sampleDF.col("timestamp_col").lt("2016-06-30 00:00:00") or sampleDF.col("timestamp_col").isNull)
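
A few things in the three filters above look worth double-checking. "2016-06-31" is not a real calendar date (June has 30 days), and how Spark handles it may depend on the version. Rows dated 2016-06-30 match neither finalDF2 nor finalDF3, so they would be dropped. And since the latest timestamp is 2017-01-01 18:53:23, finalDF1 would come out empty. One possible alternative, only a sketch, is to use half-open intervals (>= lower bound, < upper bound) so that every row lands in exactly one part. The cut points 2016-03-01 and 2016-08-01, the object name SplitByTimestamp, and the small sample data below are assumptions for illustration, and SparkSession assumes Spark 2.0 or later.

```scala
import org.apache.spark.sql.{Column, DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, lit}

object SplitByTimestamp {
  // Hypothetical cut points (not from the original post); pick boundaries that suit the data.
  val cut1: Column = lit("2016-03-01 00:00:00").cast("timestamp")
  val cut2: Column = lit("2016-08-01 00:00:00").cast("timestamp")

  // Splits df into three parts using half-open intervals on timestamp_col.
  // Null timestamps go into the first part, as in the original finalDF3.
  def split(df: DataFrame): (DataFrame, DataFrame, DataFrame) = {
    val ts = col("timestamp_col")
    val early  = df.where(ts < cut1 || ts.isNull)
    val middle = df.where(ts >= cut1 && ts < cut2)
    val late   = df.where(ts >= cut2)
    (early, middle, late)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("split-by-timestamp")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample rows standing in for sampleDF.
    val sampleDF = Seq("2015-10-07 19:36:59", "2016-06-30 12:00:00", "2017-01-01 18:53:23")
      .toDF("raw")
      .select(col("raw").cast("timestamp").as("timestamp_col"))

    val (early, middle, late) = split(sampleDF)

    // Sanity check: the three parts together should cover every row exactly once.
    assert(early.count() + middle.count() + late.count() == sampleDF.count())

    spark.stop()
  }
}
```

Comparing against an explicit timestamp literal rather than a bare string avoids relying on implicit string-to-timestamp coercion, and the count check at the end is a cheap way to confirm that no rows were lost or duplicated by the split.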