I have a small HDP 2.3 cluster (5 nodes) that I've set up to get some experience with Hive + Tez. I have a couple of tables that I'm creating with what I believe is fairly simple DDL:
CREATE TABLE SomeTable_csv(valueA int, valueB int, valueC timestamp)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ',' ESCAPED BY '\\'
  LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/some/long/path'
TBLPROPERTIES ("skip.header.line.count"="1");
And then I load a single file into that table. I then try to turn it into a table backed by an ORC file, so I do something similar:
CREATE TABLE SomeTable(valueA int, valueB int, valueC timestamp)
STORED AS ORC
LOCATION '/some/long/path';

INSERT OVERWRITE TABLE SomeTable
SELECT valueA, valueB, valueC FROM SomeTable_csv;
This produces a Tez job with 1 mapper and 1 reducer that takes hours to run (the input file is about 50 GB). I expected Hive to make a reasonable attempt to use more mappers, since this is a simple mapping process, possibly aligning splits to the HDFS block size. Any hints on how I can get the job to split the work across more than one mapper? I could split the input file before importing it into HDFS, but it seems like that's something the job itself should be handling.
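From what I've read, Tez groups input splits according to its grouping settings, so something like the following might force more mappers. The byte values below are just examples I picked for illustration, not tuned recommendations:

```sql
-- Tez combines file splits into "grouped splits"; shrinking the group
-- size range should yield more mappers. Values are illustrative only.
SET tez.grouping.min-size=134217728;   -- 128 MB lower bound per grouped split
SET tez.grouping.max-size=1073741824;  -- 1 GB upper bound per grouped split

-- The underlying input format's maximum split size may also matter
-- for plain text files:
SET mapreduce.input.fileinputformat.split.maxsize=268435456;  -- 256 MB
```

Is this the right knob to be turning, or is something else preventing the file from being split at all?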
Thanks