how to get partition column info in Data Source V2 writer

aakash aakash
Hi Spark dev folks,

First of all, kudos on the new Data Source V2. The API looks simple, and it makes it easy to develop and use a new data source.

In my current work, I am trying to implement a new Data Source V2 writer with Spark 2.3, and I was wondering how to get the partition-by column information. I see that DataFrameWriter passes it to Data Source V1, but not to V2.


Thanks,
Aakash

Re: how to get partition column info in Data Source V2 writer

Andrew Melo
Hi Aakash

On Tue, Dec 17, 2019 at 12:42 PM aakash aakash <[hidden email]> wrote:
> In my current work, I am trying to implement a new Data Source V2 writer with Spark 2.3, and I was wondering how to get the partition-by column information. I see that DataFrameWriter passes it to Data Source V1, but not to V2.

Not directly related to your question, but just so you're aware, the DSv2 API evolved from 2.3 to 2.4 and then again from 2.4 to 3.0.

Cheers
Andrew

Re: how to get partition column info in Data Source V2 writer

aakash aakash
Thanks Andrew!

It seems there is a drastic change in 3.0; I am going through it now.

-Aakash

On Tue, Dec 17, 2019 at 11:01 AM Andrew Melo <[hidden email]> wrote:
> Not directly related to your Q, but just so you're aware, the DSv2 API evolved from 2.3->2.4 and then again for 2.4->3.0.

Re: how to get partition column info in Data Source V2 writer

cloud0fan
Hi Aakash,

You can try the latest DS v2 with the 3.0 preview; the API is in quite stable shape now. With the latest API, a Writer is created from a Table, and the Table has the partitioning information.
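
A minimal sketch of that relationship, assuming nothing beyond the description above: the table owns the partitioning, and the writer it creates receives it from the table. The interfaces here are hypothetical stand-ins that only mirror the shape of the API so the example runs without Spark on the classpath; they are not Spark's classes. In Spark 3.0 the corresponding pieces appear to be `Table.partitioning()` and `SupportsWrite.newWriteBuilder(...)` under `org.apache.spark.sql.connector`, but the exact signatures changed between the preview and the final release, so check the Javadoc for the version in use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// All types below are hypothetical stand-ins, NOT Spark's classes.
// They only illustrate "a Writer is created from a Table, and the
// Table has the partitioning information".
class PartitionAwareSketch {

    /** Stand-in for a table that knows its own partition columns. */
    interface Table {
        List<String> partitioning();
        WriteBuilder newWriteBuilder(Map<String, String> options);
    }

    /** Stand-in for the builder a table hands back for each write. */
    interface WriteBuilder {
        Writer build();
    }

    /** Stand-in for the writer; it exposes the columns only so the demo can inspect them. */
    interface Writer {
        List<String> partitionColumns();
    }

    /** A table implementation that forwards its partitioning to the writers it creates. */
    static final class DemoTable implements Table {
        private final List<String> partitionColumns;

        DemoTable(List<String> partitionColumns) {
            // Defensive copy so later changes to the caller's list do not leak in.
            this.partitionColumns =
                Collections.unmodifiableList(new ArrayList<>(partitionColumns));
        }

        @Override
        public List<String> partitioning() {
            return partitionColumns;
        }

        @Override
        public WriteBuilder newWriteBuilder(Map<String, String> options) {
            // The writer never asks DataFrameWriter for partition columns;
            // it gets them from the table that creates it.
            List<String> cols = partitioning();
            Writer writer = () -> cols;
            return () -> writer;
        }
    }

    /** Builds a writer from a table and returns the partition columns the writer sees. */
    static List<String> partitionColumnsSeenByWriter(List<String> tablePartitioning) {
        Table table = new DemoTable(tablePartitioning);
        Writer writer = table.newWriteBuilder(Collections.<String, String>emptyMap()).build();
        return writer.partitionColumns();
    }

    public static void main(String[] args) {
        System.out.println(partitionColumnsSeenByWriter(Arrays.asList("date", "country")));
    }
}
```

In a real connector the partitioning would be expressed as transforms rather than plain column names, so a writer would need to unpack them before use; the sketch uses strings only to keep the flow visible.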

Thanks,
Wenchen

On Wed, Dec 18, 2019 at 3:22 AM aakash aakash <[hidden email]> wrote:
> It seems there is a drastic change in 3.0, going through it.

Re: how to get partition column info in Data Source V2 writer

aakash aakash
Thanks Wenchen!

On Wed, Dec 18, 2019 at 7:25 PM Wenchen Fan <[hidden email]> wrote:
> You can try the latest DS v2 with the 3.0 preview, and the API is in a quite stable shape now. With the latest API, a Writer is created from a Table, and the Table has the partitioning information.