skip.header.line.count is ignored in HiveContext

skip.header.line.count is ignored in HiveContext

sunerhan1992@sina.com
Hello,
       I have a Hive table, backed by CSV files, that is configured to skip
the header row using
TBLPROPERTIES("skip.header.line.count"="1").
When I query it from Hive, the header row is not included in the data, but
when I run the same query via HiveContext, I get the header row.
Running "show create table" via HiveContext confirms that it is aware of the
setting.
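
To make the reported difference concrete, here is a minimal plain-Python model of the two behaviors. It is not Hive or Spark code; the function name `read_rows`, its `skip_header_line_count` parameter, and the sample data are all made up for illustration.

```python
import csv


def read_rows(file_text, skip_header_line_count=0):
    """Parse CSV text, dropping the first `skip_header_line_count` lines.

    Hypothetical helper that only models the table property's effect.
    """
    lines = file_text.splitlines()
    return list(csv.reader(lines[skip_header_line_count:]))


# Made-up file content: one header line followed by two data rows.
sample = "id,name\n1,alice\n2,bob\n"

# A reader that honors the property (the behavior reported for Hive).
honoring = read_rows(sample, skip_header_line_count=1)

# A reader that ignores the property (the behavior reported for HiveContext).
ignoring = read_rows(sample)
```

In this model, `ignoring` contains one extra row, the header, which matches the symptom described in this post.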
 
 
This problem was reported against Spark 1.5.1, and the issue was closed without a fix.
I want to know whether this problem will be fixed in Spark 2.1?


Re: skip.header.line.count is ignored in HiveContext

Dongjoon Hyun-2
Hi, 

For 2.1.X, 2.1.2 is already out, and I don't think 2.1.3 will be released.

At that time, I made a PR for SPARK-11374 based on Spark 2.0. I believe the patch works in 2.1, too. You can try it if you need it.

As you know, the PR was not accepted; 'Won't Fix' was the community decision for Spark 2.X. So I guess 2.2.1 and 2.3 will not have a fix for that either.

Bests,
Dongjoon.


On Thu, Nov 9, 2017 at 5:59 PM, [hidden email] <[hidden email]> wrote:
Hello,
       I have a Hive table, backed by CSV files, that is configured to skip
the header row using
TBLPROPERTIES("skip.header.line.count"="1").
When I query it from Hive, the header row is not included in the data, but
when I run the same query via HiveContext, I get the header row.
Running "show create table" via HiveContext confirms that it is aware of the
setting.
 
 
This problem was reported against Spark 1.5.1, and the issue was closed without a fix.
I want to know whether this problem will be fixed in Spark 2.1?
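
Since the reply above suggests no fix is expected in Spark 2.X, one possible workaround is to bypass the Hive table and read the underlying CSV files with Spark's own CSV reader, which has a `header` option. The sketch below assumes Spark 2.0 or later and that `spark` is a `SparkSession`; the function name and the path are hypothetical.

```python
def read_csv_without_header(spark, path):
    """Read CSV files directly so Spark's CSV reader consumes the header row.

    Assumptions: `spark` is a SparkSession (Spark 2.0+), and `path` points
    at the directory holding the table's CSV files.
    """
    # With header=true, the first line of each file supplies the column
    # names and is not returned as a data row.
    return spark.read.option("header", "true").csv(path)


# Usage sketch (the path is made up):
# df = read_csv_without_header(spark, "/path/to/table/files")
```

Note that this reads the files rather than the Hive table, so the column names come from the header line and, unless a schema is supplied or inferred, every column is read as a string.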