[DISCUSS] SPIP: Standardize SQL logical plans

[DISCUSS] SPIP: Standardize SQL logical plans

Ryan Blue
Hi everyone,

A few weeks ago, I wrote up a proposal to standardize SQL logical plans and a supporting design doc for data source catalog APIs. From the comments on those docs, it looks like we mostly have agreement around standardizing plans and around the data source catalog API.

We still need to work out details, like the transactional API extension, but I'd like to get started implementing those proposals so we have something working for the 2.4.0 release. I'm starting this thread because I think we're about ready to vote on the proposal and I'd like to get any remaining discussion going or get anyone that missed this to read through the docs.

Thanks!

rb

--
Ryan Blue
Software Engineer
Netflix

Re: [DISCUSS] SPIP: Standardize SQL logical plans

cloud0fan
Hi Ryan,

Great job on this! Shall we call a vote for the plan standardization SPIP? I think this is a good idea and we should do it.

Notes:
We definitely need new user-facing APIs to produce these new logical plans, like DeleteData. But we need a design doc for these new APIs after the SPIP passes.
We definitely need the data source to provide the ability to create/drop/alter/lookup tables, but that belongs to the other SPIP and should be voted separately.

Thanks,
Wenchen

On Fri, Apr 20, 2018 at 5:01 AM Ryan Blue <[hidden email]> wrote:
Hi everyone,

A few weeks ago, I wrote up a proposal to standardize SQL logical plans and a supporting design doc for data source catalog APIs. From the comments on those docs, it looks like we mostly have agreement around standardizing plans and around the data source catalog API.

We still need to work out details, like the transactional API extension, but I'd like to get started implementing those proposals so we have something working for the 2.4.0 release. I'm starting this thread because I think we're about ready to vote on the proposal and I'd like to get any remaining discussion going or get anyone that missed this to read through the docs.

Thanks!

rb

--
Ryan Blue
Software Engineer
Netflix

Re: [DISCUSS] SPIP: Standardize SQL logical plans

Ryan Blue
Thanks! I'm all for calling a vote on the SPIP. If I understand the process correctly, the intent is for a "shepherd" to do it. I'm happy to call a vote, or feel free if you'd like to play that role.

Other comments:
* DeleteData API: I completely agree that we need to have a proposal for it. I think the SQL side is easier because DELETE FROM is already a statement. We just need to be able to identify v2 tables to use it. I'll come up with something and send a proposal to the dev list.
* Table create/drop/alter/load API: I think we have agreement around the proposed DataSourceV2 API, but we need to decide how the public API will work and how this will fit in with ExternalCatalog (see the other thread for discussion there). Do you think we need to get that entire SPIP approved before we can start getting the API in? If so, what do you think needs to be decided to get it ready?

Thanks!

rb

On Wed, Jul 11, 2018 at 8:24 PM Wenchen Fan <[hidden email]> wrote:
Hi Ryan,

Great job on this! Shall we call a vote for the plan standardization SPIP? I think this is a good idea and we should do it.

Notes:
We definitely need new user-facing APIs to produce these new logical plans, like DeleteData. But we need a design doc for these new APIs after the SPIP passes.
We definitely need the data source to provide the ability to create/drop/alter/lookup tables, but that belongs to the other SPIP and should be voted separately.

Thanks,
Wenchen


--
Ryan Blue
Software Engineer
Netflix

Re: [DISCUSS] SPIP: Standardize SQL logical plans

cloud0fan
I don't know of an official answer, but conventionally the people who propose the SPIP call the vote and "shepherd" the project. Other people can jump in during development. I'm interested in the new API and would like to work on it after the vote passes.

Thanks,
Wenchen

On Fri, Jul 13, 2018 at 7:25 AM Ryan Blue <[hidden email]> wrote:
Thanks! I'm all for calling a vote on the SPIP. If I understand the process correctly, the intent is for a "shepherd" to do it. I'm happy to call a vote, or feel free if you'd like to play that role.

Other comments:
* DeleteData API: I completely agree that we need to have a proposal for it. I think the SQL side is easier because DELETE FROM is already a statement. We just need to be able to identify v2 tables to use it. I'll come up with something and send a proposal to the dev list.
* Table create/drop/alter/load API: I think we have agreement around the proposed DataSourceV2 API, but we need to decide how the public API will work and how this will fit in with ExternalCatalog (see the other thread for discussion there). Do you think we need to get that entire SPIP approved before we can start getting the API in? If so, what do you think needs to be decided to get it ready?

Thanks!

rb


--
Ryan Blue
Software Engineer
Netflix

Re: [DISCUSS] SPIP: Standardize SQL logical plans

Cody Koeninger-2
According to

http://spark.apache.org/improvement-proposals.html

the shepherd should be a PMC member, not necessarily the person who
proposed the SPIP

On Tue, Jul 17, 2018 at 9:13 AM, Wenchen Fan <[hidden email]> wrote:

> I don't know of an official answer, but conventionally the people who propose
> the SPIP call the vote and "shepherd" the project. Other people can jump in
> during development. I'm interested in the new API and would like to work on
> it after the vote passes.
>
> Thanks,
> Wenchen


Re: [DISCUSS] SPIP: Standardize SQL logical plans

Ryan Blue
I just called a vote on this. I don't think we really need a shepherd if there's enough interest for a vote to pass.

rb

On Tue, Jul 17, 2018 at 9:00 AM Cody Koeninger <[hidden email]> wrote:
According to

http://spark.apache.org/improvement-proposals.html

the shepherd should be a PMC member, not necessarily the person who
proposed the SPIP


--
Ryan Blue
Software Engineer
Netflix