DataSourceV2 hangouts sync


Ryan Blue
Hi everyone,

There's been some great discussion for DataSourceV2 in the last few months, but it has been difficult to resolve some of the discussions and I don't think that we have a very clear roadmap for getting the work done.

To coordinate better as a community, I'd like to start a regular sync-up over Google Hangouts. We use this in the Parquet community to have more effective community discussions about thorny technical issues and to get aligned on an overall roadmap. It is really helpful in that community and I think it would help us get DSv2 done more quickly.

Here's how it works: people join the hangout, we go around the list to gather topics, have about an hour-long discussion, and then send a summary of the discussion to the dev list for anyone that couldn't participate. That way we can move topics along, but we keep the broader community in the loop as well for further discussion on the mailing list.

I'll volunteer to set up the sync and send invites to anyone that wants to attend. If you're interested, please reply with the email address you'd like to put on the invite list (if there's a way to do this without specific invites, let me know). Also for the first sync, please note what times would work for you so we can try to account for people in different time zones.

For the first one, I was thinking some day next week (time TBD by those interested) and starting off with a general roadmap discussion before diving into specific technical topics.

Thanks,

rb

--
Ryan Blue
Software Engineer
Netflix

Re: DataSourceV2 hangouts sync

Felix Cheung
Yes please!

 

From: Ryan Blue <[hidden email]>
Sent: Thursday, October 25, 2018 1:10 PM
To: Spark Dev List
Subject: DataSourceV2 hangouts sync
 

Re: DataSourceV2 hangouts sync

John Zhuge
In reply to this post by Ryan Blue
Great idea!

On Thu, Oct 25, 2018 at 1:10 PM Ryan Blue <[hidden email]> wrote:


--
John Zhuge

Re: DataSourceV2 hangouts sync

Li Jin
Although I am not specifically involved in DSv2, I think this kind of meeting is definitely helpful for discussing issues, moving certain efforts forward, and keeping people on the same page. Glad to see this kind of working group happening.

On Thu, Oct 25, 2018 at 5:58 PM John Zhuge <[hidden email]> wrote:

Re: DataSourceV2 hangouts sync

rxin
+1



On Thu, Oct 25, 2018 at 4:12 PM Li Jin <[hidden email]> wrote:

Re: DataSourceV2 hangouts sync

Xiao Li
+1

On Thu, Oct 25, 2018 at 4:16 PM Reynold Xin <[hidden email]> wrote:

Re: DataSourceV2 hangouts sync

Dongjoon Hyun-2
+1. Thank you for volunteering, Ryan!

Bests,
Dongjoon.


On Thu, Oct 25, 2018 at 4:19 PM Xiao Li <[hidden email]> wrote:

Re: DataSourceV2 hangouts sync

Hyukjin Kwon
+1 !

On Fri, Oct 26, 2018 at 7:21 AM Dongjoon Hyun <[hidden email]> wrote:

Re: DataSourceV2 hangouts sync

cloud0fan
In reply to this post by Dongjoon Hyun-2
Big +1 on this!

I live in UTC+8 and I'm available from 8 am, which is 5 pm in the bay area. Hopefully we can coordinate a time that fits everyone.

Thanks
Wenchen
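
As a quick sanity check of the time-zone arithmetic above, here is a minimal Python sketch. It assumes the Bay Area is on PDT (UTC-7) in late October 2018 and uses Monday, October 29, 2018 purely as an illustrative date; the helper name `to_utc_plus_8` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets are an assumption: PDT is UTC-7, and the UTC+8 zone has no DST.
PDT = timezone(timedelta(hours=-7), "PDT")
UTC_PLUS_8 = timezone(timedelta(hours=8), "UTC+8")


def to_utc_plus_8(bay_area_time: datetime) -> datetime:
    """Convert a timezone-aware Bay Area time to UTC+8 wall-clock time."""
    return bay_area_time.astimezone(UTC_PLUS_8)


# Illustrative date only: Monday, October 29, 2018 at 5 PM PDT.
monday_5pm = datetime(2018, 10, 29, 17, 0, tzinfo=PDT)
converted = to_utc_plus_8(monday_5pm)
print(converted.strftime("%A %H:%M"))  # Tuesday 08:00
```

While PDT is in effect the two zones are 15 hours apart, so a 5 PM start in the Bay Area falls at 8 AM on the next calendar day in UTC+8.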



On Fri, Oct 26, 2018 at 7:21 AM Dongjoon Hyun <[hidden email]> wrote:

Re: DataSourceV2 hangouts sync

Ryan Blue
Since not many people have replied with a time window, how about we aim for 5PM PDT? That should work for Wenchen and most people here in the bay area.

If that makes it so some people can't attend, we can do the next one earlier for people in Europe.

If we go with 5PM PDT, then what day works best for everyone?

On Thu, Oct 25, 2018 at 5:01 PM Wenchen Fan <[hidden email]> wrote:


--
Ryan Blue
Software Engineer
Netflix

Re: DataSourceV2 hangouts sync

cloud0fan
Friday in the Bay Area is Saturday on my side, so it would be great if we could pick a day from Monday to Thursday.

On Fri, Oct 26, 2018 at 8:08 AM Ryan Blue <[hidden email]> wrote:

Re: DataSourceV2 hangouts sync

Ryan Blue
Good point. How about Monday or Wednesday at 5PM PDT then?

Everyone, please reply to me (no need to spam the list) with which option works for you and I'll send an invite for the one with the most votes.

On Thu, Oct 25, 2018 at 5:14 PM Wenchen Fan <[hidden email]> wrote:


--
Ryan Blue
Software Engineer
Netflix

Re: DataSourceV2 hangouts sync

Hyukjin Kwon
I didn't know I live in the same time zone as you, Wenchen :D.
Monday or Wednesday at 5PM PDT sounds good to me too, FWIW.

On Fri, Oct 26, 2018 at 8:29 AM Ryan Blue <[hidden email]> wrote:

Re: DataSourceV2 hangouts sync

Saikat Kanjilal
In reply to this post by cloud0fan
Ditto. I'd also like to join. I'm in Seattle, and afternoons generally work better for me.

Sent from my iPhone

On Oct 25, 2018, at 5:02 PM, Wenchen Fan <[hidden email]> wrote:

Big +1 on this!

I live in UTC+8 and I'm available from 8 am, which is 5 pm in the bay area. Hopefully we can coordinate a time that fits everyone.

Thanks
Wenchen



On Fri, Oct 26, 2018 at 7:21 AM Dongjoon Hyun <[hidden email]> wrote:
+1. Thank you for volunteering, Ryan!

Bests,
Dongjoon.


On Thu, Oct 25, 2018 at 4:19 PM Xiao Li <[hidden email]> wrote:
+1

On Thu, Oct 25, 2018 at 4:16 PM Reynold Xin <[hidden email]> wrote:
+1



On Thu, Oct 25, 2018 at 4:12 PM Li Jin <[hidden email]> wrote:
Although I am not specifically involved in DSv2, I think this kind of meeting is definitely helpful for discussing issues, moving efforts forward, and keeping people on the same page. Glad to see this kind of working group happening.


Re: DataSourceV2 hangouts sync

Gengliang
In reply to this post by Hyukjin Kwon
+1

On Oct 26, 2018, at 8:45 AM, Hyukjin Kwon <[hidden email]> wrote:

I didn't know I live in the same time zone as you, Wenchen :D.
Monday or Wednesday at 5PM PDT sounds good to me too FWIW. 

On Fri, Oct 26, 2018 at 8:29 AM Ryan Blue <[hidden email]> wrote:
Good point. How about Monday or Wednesday at 5PM PDT then?

Everyone, please reply to me (no need to spam the list) with which option works for you and I'll send an invite for the one with the most votes.

On Thu, Oct 25, 2018 at 5:14 PM Wenchen Fan <[hidden email]> wrote:
Friday in the Bay Area is Saturday on my side, so it would be great if we could pick a day from Monday to Thursday.

On Fri, Oct 26, 2018 at 8:08 AM Ryan Blue <[hidden email]> wrote:
Since not many people have replied with a time window, how about we aim for 5PM PDT? That should work for Wenchen and most people here in the bay area.

If that makes it so some people can't attend, we can do the next one earlier for people in Europe.

If we go with 5PM PDT, then what day works best for everyone?


Re: DataSourceV2 hangouts sync

Ryan Blue
Looks like the majority opinion is for Wednesday. I've sent out an invite to everyone that replied and will add more people as I hear more responses.

Thanks, everyone!

--
Ryan Blue
Software Engineer
Netflix

Re: DataSourceV2 hangouts sync

RussS
Responding for invite 


Re: DataSourceV2 hangouts sync

Ryan Blue
In reply to this post by Ryan Blue
Everyone,

There are now 25 guests invited, which is a lot of people to actively participate in a sync like this.

For those of you who probably won't actively participate, I've added a live stream. If you don't plan to talk, please use the live stream instead of the meet/hangout so that we don't end up with so many people that we can't actually get the discussion going. Here's a link to the stream:


Thanks!

rb

--
Ryan Blue
Software Engineer
Netflix

Re: DataSourceV2 hangouts sync

cloud0fan
Hi all,

I spent some time thinking about the roadmap, and came up with an initial list:
SPARK-25390: data source V2 API refactoring
SPARK-24252: add catalog support
SPARK-25531: new write APIs for data source v2
SPARK-25190: better operator pushdown API
Streaming rate control API
Custom metrics API
Migrate existing data sources
Move data source v2 and built-in implementations to individual modules.


Let's have more discussion over the hangout.

Thanks,
Wenchen


Re: DataSourceV2 hangouts sync

Arun Mahadevan
Thanks for bringing up the custom metrics API in the list; it's something that needs to be addressed.

A couple more items worth considering:

1. Possibility of unifying the batch, micro-batch, and continuous sources (similar to SPARK-25000).
    Right now there is significant code duplication even between micro-batch and continuous sources.
    Attempt a redesign such that a single implementation could potentially work across modes (by implementing the relevant APIs).
2. Better framework support for end-to-end exactly-once in streaming (maybe framework-level support for 2PC).

Thanks,
Arun

