More publicly documenting the options under spark.sql.*

More publicly documenting the options under spark.sql.*

Nicholas Chammas
I filed SPARK-30510 thinking that we had forgotten to document an option, but it turns out there are many options defined in SQLConf.scala that have no public documentation at http://spark.apache.org/docs.

Would it be appropriate to somehow automatically generate a documentation page from SQLConf.scala, as Hyukjin suggested on that ticket?

Another thought that comes to mind is moving the config definitions out of Scala and into a data format like YAML or JSON, and then sourcing it both for SQLConf and for whatever documentation page we want to generate. What do you think of that idea?
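For context, a config declaration in SQLConf.scala today looks roughly like the following (a simplified, non-runnable sketch: the real builder offers more methods, and the doc text here is paraphrased):

    // Simplified sketch of the existing declaration style in SQLConf.scala.
    // The actual definition carries more metadata (value validation, etc.).
    val PARTITION_OVERWRITE_MODE = buildConf("spark.sql.sources.partitionOverwriteMode")
      .doc("Paraphrased: whether INSERT OVERWRITE on a partitioned data source " +
        "table replaces all matching partitions (static) or only the partitions " +
        "that receive new data (dynamic).")
      .stringConf
      .createWithDefault("static")

Whatever format the definitions live in, the key point is that SQLConf and the documentation page would share a single source of truth.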

Nick


Re: More publicly documenting the options under spark.sql.*

Sean Owen-2
Some of it is intentionally undocumented, as far as I know: experimental options that may change, legacy options, or safety-valve flags. Certainly anything that's marked as an internal conf. (That does raise the question of who it's for, if you have to read the source to find it.)

I don't know if we need to overhaul the conf system, but there may indeed be some confs that could legitimately be documented. I don't know which.


Re: More publicly documenting the options under spark.sql.*

Hyukjin Kwon
I think automatically creating a configuration page isn't a bad idea, because we deprecate and remove configurations that are not created via .internal() in SQLConf anyway.

I already tried this kind of automatic generation from the code for the SQL built-in functions, and I'm pretty sure we can do the same for configurations.
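To illustrate the .internal() distinction (a hypothetical key, sketched from memory of the builder API rather than copied from SQLConf):

    // Internal confs like this are excluded from user-facing listings such
    // as SET -v, so a generated page would naturally skip them.
    // The key below is made up for illustration.
    val SOME_INTERNAL_FLAG = buildConf("spark.sql.example.internalFlag")
      .internal()
      .doc("Internal-only knob; not meant for end users.")
      .booleanConf
      .createWithDefault(false)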




Re: More publicly documenting the options under spark.sql.*

Nicholas Chammas
So do we want to repurpose SPARK-30510 as a SQL config refactor?

Alternatively, what’s the smallest step forward I can take to publicly document partitionOverwriteMode (which was my impetus for looking into this in the first place)?
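For reference, the usage that needs documenting looks like this (a minimal sketch, assuming Spark 2.3+ where this conf exists; the table name is made up):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("partition-overwrite-example")
      .getOrCreate()

    // With "dynamic", INSERT OVERWRITE replaces only the partitions that the
    // job actually writes to, instead of every matching partition.
    spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")

    val updates = spark.range(10).selectExpr("id", "id % 2 AS part")
    updates.write
      .mode("overwrite")
      .insertInto("events")  // "events" is a hypothetical partitioned table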


Re: More publicly documenting the options under spark.sql.*

Felix Cheung
In reply to this post by Hyukjin Kwon
I think it's a good idea.



Re: More publicly documenting the options under spark.sql.*

Shixiong(Ryan) Zhu
In reply to this post by Hyukjin Kwon
"spark.sql("set -v")" returns a Dataset that has all non-internal SQL configurations. Should be pretty easy to automatically generate a SQL configuration page.

Best Regards,

Ryan



Re: More publicly documenting the options under spark.sql.*

Takeshi Yamamuro
The idea looks nice. I think web documentation always helps end users.

Bests,
Takeshi


Re: More publicly documenting the options under spark.sql.*

Hyukjin Kwon
Nicholas, are you interested in taking a stab at this? You could refer to https://github.com/apache/spark/commit/60472dbfd97acfd6c4420a13f9b32bc9d84219f3
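For a sense of what the configuration version might look like, here's a rough sketch (not the actual generator from that commit; the output shape and file name are assumptions):

    import java.nio.file.{Files, Paths}
    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("gen-sql-conf-docs")
      .getOrCreate()

    // Assumed SET -v columns: key, value, meaning.
    val rows = spark.sql("SET -v").collect()
    val body = rows.sortBy(_.getString(0)).map { r =>
      s"<tr><td><code>${r.getString(0)}</code></td>" +
        s"<td>${r.getString(1)}</td><td>${r.getString(2)}</td></tr>"
    }.mkString("\n")

    val page =
      s"""<table>
         |<tr><th>Property</th><th>Default</th><th>Meaning</th></tr>
         |$body
         |</table>
         |""".stripMargin
    Files.write(Paths.get("sql-configs-table.html"), page.getBytes("UTF-8"))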


Re: More publicly documenting the options under spark.sql.*

Jules Damji-2
In reply to this post by Shixiong(Ryan) Zhu
It's one thing to get the names and values of the configurations via spark.sql("set -v"), but it's another thing to understand what each one achieves and when and why you'd want to use it.

A webpage with a table and a description of each would be a huge benefit.

Cheers 
Jules 

Sent from my iPhone
Pardon the dumb thumb typos :)


Re: More publicly documenting the options under spark.sql.*

Hyukjin Kwon
Each configuration already has its own documentation; all we need to do is list them.


Re: More publicly documenting the options under spark.sql.*

Nicholas Chammas
In reply to this post by Hyukjin Kwon
I am! Thanks for the reference.


Re: More publicly documenting the options under spark.sql.*

Hyukjin Kwon
FYI, a PR is now open at https://github.com/apache/spark/pull/27459. Thanks, Nicholas.
I hope everyone can find some time to take a look.


Re: More publicly documenting the options under spark.sql.*

Hyukjin Kwon
The PR has been merged. All external SQL configurations will now be documented automatically.
