Datasource with ColumnBatchScan support.


Nasrulla Khan Haris

Hi Spark developers,

FileSourceScanExec extends ColumnarBatchScan, which internally converts ColumnarBatch to InternalRow. If I have a new data source/FileFormat that uses a custom relation instead of HadoopFsRelation, the driver uses RowDataSourceScanExec, which causes a cast exception from InternalRow to ColumnarBatch. Is there a way to provide ColumnarBatchScan support for a custom relation?

 

Appreciate your inputs.

 

Thanks,

NKH

 

Re: Datasource with ColumnBatchScan support.

cloud0fan
If you already have your own `FileFormat` implementation: just override the `supportBatch` method.
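
A minimal sketch of what that override might look like, assuming the Spark 2.4/3.0-era `FileFormat` trait. The trait name `ColumnarCapableFormat` and the type check in `isBatchFriendly` are hypothetical, chosen only for illustration; the remaining `FileFormat` members (`inferSchema`, `prepareWrite`, `buildReader`) are left to the existing implementation.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.datasources.FileFormat
import org.apache.spark.sql.types._

// Hypothetical mixin for an existing FileFormat implementation.
trait ColumnarCapableFormat extends FileFormat {

  // Illustrative assumption: only these flat types are read in batches.
  private def isBatchFriendly(dt: DataType): Boolean = dt match {
    case IntegerType | LongType | DoubleType | StringType => true
    case _ => false
  }

  // Report batch support only when every column has a supported type.
  override def supportBatch(sparkSession: SparkSession, dataSchema: StructType): Boolean =
    dataSchema.fields.forall(f => isBatchFriendly(f.dataType))
}
```

As far as I understand, once `supportBatch` returns true the reader function of the format is expected to actually produce `ColumnarBatch` objects, as the built-in Parquet format does. This only helps when the scan is planned through FileSourceScanExec.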

On Tue, Jun 16, 2020 at 5:39 AM Nasrulla Khan Haris <[hidden email]> wrote:

Hi Spark developers,

FileSourceScanExec extends ColumnarBatchScan, which internally converts ColumnarBatch to InternalRow. If I have a new data source/FileFormat that uses a custom relation instead of HadoopFsRelation, the driver uses RowDataSourceScanExec, which causes a cast exception from InternalRow to ColumnarBatch. Is there a way to provide ColumnarBatchScan support for a custom relation?

 

Appreciate your inputs.

 

Thanks,

NKH