Hi all,

While writing a partitioned DataFrame as partitioned text files, I see that Spark deletes all existing partitions even when only a few new partitions are being written:

dataDF.write.partitionBy("year", "month", "date").mode(SaveMode.Overwrite).text("s3://data/test2/events/")

Is this expected behavior?

I have a past-correction job that overwrites a couple of old partitions based on newly arriving data, and I only want those partitions replaced. Is there a neater way to do that than:
- Find the affected partitions
- Delete them using the Hadoop APIs
- Write the DataFrame in Append mode

Cheers,
Yash
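For reference, a minimal sketch of the three-step workaround described above. This is not a definitive implementation: `spark`, `correctedDF`, and the partition values are assumptions standing in for the actual session, the corrected data (already filtered to the affected partitions), and the real partition keys.

```scala
import java.net.URI
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.sql.SaveMode

// Hypothetical list of partitions to correct; values are illustrative only.
val partitionsToReplace = Seq(("2017", "01", "15"), ("2017", "01", "16"))

// 1. Find and delete only the affected partition directories via the Hadoop FileSystem API.
val fs = FileSystem.get(new URI("s3://data/"), spark.sparkContext.hadoopConfiguration)
partitionsToReplace.foreach { case (y, m, d) =>
  fs.delete(new Path(s"s3://data/test2/events/year=$y/month=$m/date=$d"), true) // recursive delete
}

// 2. Append the corrected data so untouched partitions survive.
correctedDF.write
  .partitionBy("year", "month", "date")
  .mode(SaveMode.Append)
  .text("s3://data/test2/events/")
```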