Channel: Hortonworks » All Replies

Only one reducer when inserting into a dynamically partitioned ORC table


Hi,

I am running HDP 2.2 on a 10-node cluster, using Tez and YARN.
The Hive version is 0.14.

I have a 90-million-row table stored as a plain-text CSV file of about 10 GB.

I am trying to insert it into a partitioned ORC table using the statement:

insert overwrite table 2h2 partition (dt) select *,TIME_STAMP from 2h_tmp;

dt is the dynamic partition key.

Tez allocates only one reducer to the job, which results in a 6-hour run.

I expect about 120 partitions to be created.

How can I increase number of reducers to speed up this job?

Is this related to https://issues.apache.org/jira/browse/HIVE-7158? It is marked as resolved for Hive 0.14.

I am running with the default values for the following settings:

hive.tez.auto.reducer.parallelism

Default Value: false
Added In: Hive 0.14.0 with HIVE-7158

hive.tez.max.partition.factor

Default Value: 2
Added In: Hive 0.14.0 with HIVE-7158

hive.tez.min.partition.factor

Default Value: 0.25
Added In: Hive 0.14.0 with HIVE-7158

I have also set:
hive.exec.dynamic.partition=true;
hive.exec.dynamic.partition.mode=nonstrict;
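
For reference, here is a sketch of how the settings listed above could be overridden for a single session before rerunning the insert. It is untested on this cluster, and I do not know whether it actually raises the reducer count for this job. The hive.exec.reducers.bytes.per.reducer line is an assumption that is not part of my current configuration, and its value is only illustrative.

```sql
-- Session-level overrides: a sketch, not a verified fix for this job.

-- Let Tez adjust reducer parallelism at runtime (HIVE-7158); the default is false.
set hive.tez.auto.reducer.parallelism=true;

-- Bounds for that runtime adjustment, shown here at their default values.
set hive.tez.max.partition.factor=2.0;
set hive.tez.min.partition.factor=0.25;

-- Assumption (not in the configuration above): lower the target input size
-- per reducer so the initial estimate asks for more reducers.
-- 134217728 bytes = 128 MB, an illustrative value only.
set hive.exec.reducers.bytes.per.reducer=134217728;

-- Dynamic partitioning, as already enabled for this job.
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

-- The original statement, unchanged.
insert overwrite table 2h2 partition (dt) select *,TIME_STAMP from 2h_tmp;
```

The resulting reducer count can be checked in the Tez vertex summary that Hive prints while the query runs.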

