dev/merge_spark_pr.py broken on python 2

dev/merge_spark_pr.py broken on python 2

Marcelo Vanzin-2
Hey all,

Something broke that script when running with python 2.

I know we want to deprecate python 2, but in that case, scripts should
at least be changed to use "python3" in the shebang line...
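
A minimal sketch of what that could look like, assuming the script is launched directly (e.g. ./dev/merge_spark_pr.py) so the shebang line takes effect. The require_python3 helper is hypothetical and not part of the actual merge script:

```python
#!/usr/bin/env python3
import sys


def require_python3(version_info=sys.version_info):
    """Exit with a clear message under Python 2 (hypothetical helper)."""
    if version_info[0] < 3:
        sys.exit("This script requires Python 3; found Python %d.%d"
                 % (version_info[0], version_info[1]))


if __name__ == "__main__":
    # The guard also covers an explicit "python2 script.py" invocation,
    # where the shebang line is ignored.
    require_python3()
```

The explicit check is there because a shebang only helps when the file is executed directly; it does nothing if someone runs the script through a Python 2 interpreter by name.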

--
Marcelo

---------------------------------------------------------------------
To unsubscribe e-mail: [hidden email]

Re: dev/merge_spark_pr.py broken on python 2

Sean Owen-2
Hm, the last change was on Oct 1, and should have actually helped it
still work with Python 2:
https://github.com/apache/spark/commit/2ec3265ae76fc1e136e44c240c476ce572b679df#diff-c321b6c82ebb21d8fd225abea9b7b74c

Hasn't otherwise changed in a while. What's the error?

On Fri, Nov 8, 2019 at 11:37 AM Marcelo Vanzin
<[hidden email]> wrote:

>
> Hey all,
>
> Something broke that script when running with python 2.
>
> I know we want to deprecate python 2, but in that case, scripts should
> at least be changed to use "python3" in the shebang line...
>
> --
> Marcelo

Re: dev/merge_spark_pr.py broken on python 2

Marcelo Vanzin-2
Something related to non-ASCII characters. Worked fine with python 3.

git branch -D PR_TOOL_MERGE_PR_26426_MASTER
Traceback (most recent call last):
  File "./dev/merge_spark_pr.py", line 577, in <module>
    main()
  File "./dev/merge_spark_pr.py", line 552, in main
    merge_hash = merge_pr(pr_num, target_ref, title, body, pr_repo_desc)
  File "./dev/merge_spark_pr.py", line 147, in merge_pr
    distinct_authors[0])
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf8' in position 65: ordinal not in range(128)
M       docs/running-on-kubernetes.md
Already on 'master'
Your branch is up to date with 'apache-github/master'.
error: cannot pull with rebase: Your index contains uncommitted changes.
error: please commit or stash them.

On Fri, Nov 8, 2019 at 10:17 AM Sean Owen <[hidden email]> wrote:

>
> Hm, the last change was on Oct 1, and should have actually helped it
> still work with Python 2:
> https://github.com/apache/spark/commit/2ec3265ae76fc1e136e44c240c476ce572b679df#diff-c321b6c82ebb21d8fd225abea9b7b74c
>
> Hasn't otherwise changed in a while. What's the error?

--
Marcelo

Re: dev/merge_spark_pr.py broken on python 2

Sean Owen-2
Ah OK. I think it's the same type of issue that the last change
was actually trying to fix for Python 2. Here it seems like the author
name might contain non-ASCII characters?
I don't know offhand how to resolve that for Python 2. Something to do
with how raw_input works, I take it. You could 'fix' the author name if
that's the case, or just use Python 3.

On Fri, Nov 8, 2019 at 12:20 PM Marcelo Vanzin <[hidden email]> wrote:

>
> Something related to non-ASCII characters. Worked fine with python 3.
>
> git branch -D PR_TOOL_MERGE_PR_26426_MASTER
> Traceback (most recent call last):
>   File "./dev/merge_spark_pr.py", line 577, in <module>
>     main()
>   File "./dev/merge_spark_pr.py", line 552, in main
>     merge_hash = merge_pr(pr_num, target_ref, title, body, pr_repo_desc)
>   File "./dev/merge_spark_pr.py", line 147, in merge_pr
>     distinct_authors[0])
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xf8' in
> position 65: ordinal not in range(128)
> M       docs/running-on-kubernetes.md
> Already on 'master'
> Your branch is up to date with 'apache-github/master'.
> error: cannot pull with rebase: Your index contains uncommitted changes.
> error: please commit or stash them.
>
> --
> Marcelo

Re: dev/merge_spark_pr.py broken on python 2

Marcelo Vanzin-2
I remember merging PRs with non-ascii chars in the past...

Anyway, for these scripts, might be easier to just use python3 for
everything, instead of trying to keep them working on two different
versions.

On Fri, Nov 8, 2019 at 10:28 AM Sean Owen <[hidden email]> wrote:

>
> Ah OK. I think it's the same type of issue that the last change
> actually was trying to fix for Python 2. Here it seems like the author
> name might have non-ASCII chars?
> I don't immediately know enough to know how to resolve that for Python
> 2. Something with how raw_input works, I take it. You could 'fix' the
> author name if that's the case, or just use python 3.

--
Marcelo

Re: dev/merge_spark_pr.py broken on python 2

Hyukjin Kwon
Yeah, let's stick to Python 3 in general.
I plan to drop Python 2 completely right after the Spark 3.0 release.

As for the exception you hit: it seems run_cmd in the merge script now returns unicode instead of bytes on Python 2. Later, it seems this unicode is implicitly cast to bytes by %-formatting. IIRC the implicit cast uses the default encoding, which is ASCII in Python 2.
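
A minimal sketch of that failure mode, written for Python 3 so it can be run today. The explicit .encode("ascii") stands in for the implicit conversion that Python 2 would perform, and the author string is made up for illustration (the real one does not appear in the traceback):

```python
# Hypothetical author string containing U+00F8, the character named in
# the traceback. The name and address are invented for this example.
author = "J\u00f8rgen Example <jorgen@example.com>"


def to_ascii_bytes(text):
    """Mimic Python 2's implicit unicode -> str conversion (ascii codec)."""
    return text.encode("ascii")


def to_utf8_bytes(text):
    """Encode explicitly as UTF-8, which can represent non-ASCII names."""
    return text.encode("utf-8")


try:
    to_ascii_bytes(author)
except UnicodeEncodeError as err:
    # Same exception type as in the traceback above.
    print("ascii conversion failed:", err.reason)

print(to_utf8_bytes(author))
```

Encoding explicitly at the boundary (or running under Python 3, where text and bytes are never mixed implicitly) avoids depending on the default codec.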


On Sat, 9 Nov 2019, 03:32 Marcelo Vanzin, <[hidden email]> wrote:
> I remember merging PRs with non-ascii chars in the past...
>
> Anyway, for these scripts, might be easier to just use python3 for
> everything, instead of trying to keep them working on two different
> versions.
