Tez performance tuning

Hello,

Is there any way to tune Tez for better performance? I have a 30.5 GB dataset, and my joins are taking over 200 seconds.

SELECT d.per_id, count(*), sum(lo_ioh_ext_cost_dlrs)
FROM dim_period d
JOIN fct_ioh_Day_str_pln f ON (d.per_id = f.per_id)
WHERE d.per_id IN (13879, 13880, 13881, 13882)
GROUP BY d.per_id
ORDER BY d.per_id

I noticed that when I run this query, the node I am running it from does not use much of the machine's capacity, even though max capacity is set to 75%.

CPU and memory utilization are low.

Can someone help? Or is 200+ seconds normal for a 6-node cluster when joining on a 30 GB dataset?

Please note that I am using ORC tables that are partitioned and bucketed.

Thanks

