[DISCUSS][SPARK-23889] DataSourceV2: required sorting and clustering for writes

Anton Okolnychyi-3

Hi devs,


I want to follow up on the dev list discussion [1] and the resulting JIRA issue [2], and propose a slightly different approach that lets data sources request a specific distribution and ordering of data on write.


I've put together a short document [3] describing the proposed approach. It would be great to hear what the community thinks.


The SQL part of the proposal requires further discussion, and any ideas are more than welcome.


Thanks,

Anton