Documentation on org.apache.spark.sql.functions backend.


Vipul Rajan
I am trying to create a function that reads data from Kafka, communicates with the Confluent Schema Registry, and decodes Avro data with evolving schemas. I want to avoid hack-ish patches and instead write proper code that I could perhaps even turn into pull requests. Looking at the code, I have been able to figure out a few things about how expressions are generated and how they help accomplish what a function does, but there is still a ton I cannot wrap my head around.

I have been unable to find any documentation that gets into the nitty-gritty of Spark at this level, so I am writing in the hope of finding some help. Do you have any documentation that explains how a function (org.apache.spark.sql.functions._) is turned into a logical plan?
Re: Documentation on org.apache.spark.sql.functions backend.

Marco Gaido
Hi Vipul,

A function is never turned into a logical plan. A function is turned into an Expression, and an Expression can be part of many logical or physical plans.
Hope this helps.

Thanks,
Marco
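Marco's point can be illustrated with a minimal, self-contained sketch. This is not actual Spark code: the names (Expression, Column, Upper) mirror Spark's, but the implementations are toy stand-ins. It shows the pattern that functions.scala follows, in which each public function merely wraps an Expression in a Column; the analyzer later places those Expressions inside plans.

```scala
// Toy model of the pattern in org.apache.spark.sql.functions.
// Real Spark Expressions operate on InternalRow; here we fake it
// with a plain String input to keep the sketch self-contained.

trait Expression { def eval(input: String): String }

// Stand-in for a column reference: just passes the input through.
case class AttributeReference(name: String) extends Expression {
  def eval(input: String): String = input
}

// Stand-in for Catalyst's Upper expression.
case class Upper(child: Expression) extends Expression {
  def eval(input: String): String = child.eval(input).toUpperCase
}

// A Column is a thin wrapper over an Expression -- this part matches
// Spark's actual design.
class Column(val expr: Expression)

// Mirrors the shape of functions.upper:
//   def upper(e: Column): Column = withExpr { Upper(e.expr) }
def upper(e: Column): Column = new Column(Upper(e.expr))

val col = new Column(AttributeReference("name"))
println(upper(col).expr.eval("spark")) // prints SPARK
```

No plan is involved yet: calling upper only builds an Expression tree. The tree becomes part of a logical plan only when the Column is used in an operation such as select.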

Re: Documentation on org.apache.spark.sql.functions backend.

Vipul Rajan
Hi Marco,

That does help. Thanks for taking the time. I am still confused about how that Expression is evaluated, though. There are methods like eval, nullSafeEval, and doGenCode. Are there any architectural docs that explain exactly what is happening? Reverse engineering seems a bit daunting.

Regards
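The split between those methods can be sketched with another self-contained toy model (again, not Spark's actual classes). The idea: eval is interpreted, row-at-a-time evaluation; nullSafeEval is a template-method hook that lets a subclass of a helper base class skip null handling; doGenCode (omitted here) instead emits Java source for whole-stage code generation rather than calling eval at all.

```scala
// Toy model of the eval / nullSafeEval split.
// Real Spark takes an InternalRow; here eval takes Any for simplicity.

trait Expression { def eval(input: Any): Any }

abstract class UnaryExpression(child: Expression) extends Expression {
  // Template method: null handling is written once, here.
  final def eval(input: Any): Any = {
    val v = child.eval(input)
    if (v == null) null else nullSafeEval(v)
  }
  // Subclasses implement only the non-null case.
  protected def nullSafeEval(value: Any): Any
}

// Stand-in leaf expression that just returns its input.
case class Identity() extends Expression {
  def eval(input: Any): Any = input
}

// Example unary expression: string length, null-safe for free.
case class Length(child: Expression) extends UnaryExpression(child) {
  protected def nullSafeEval(value: Any): Any =
    value.asInstanceOf[String].length
}

val len = Length(Identity())
println(len.eval("kafka")) // prints 5
println(len.eval(null))    // prints null
```

In real Spark the same Length expression would also implement doGenCode, producing a Java snippet equivalent to nullSafeEval that gets compiled into the generated code for the whole stage.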

Re: Documentation on org.apache.spark.sql.functions backend.

Marco Gaido
Hi Vipul,

I am afraid I cannot help you on that.

Thanks,
Marco
