I have a Hive query with 6-7 joins in it. I decided to use hive.exec.parallel (for parallel MapReduce) to speed up the process. The query takes about 6 minutes when executed from the Hive CLI, but when I try the same from the workflow it takes ages, and I am not sure whether it ever completes. I have added hive.exec.parallel as a property and can see it in the workflow.xml as below.
<property>
    <name>hive.exec.parallel</name>
    <value>true</value>
</property>
I use YARN and was wondering whether that has anything to do with it. For now I execute the script as an SSH action so that it uses the Hive CLI on the edge node. Is there a way I can use a Hive action and get this to complete?
If I put
set hive.exec.parallel=true
inside the script, I believe it does the same job, right?
I was also wondering whether this has anything to do with the default queue name, which might leave the running MapReduce jobs stuck in some kind of deadlock.
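To make the Hive action variant concrete, here is a minimal sketch of what such an action might look like, with both the queue name and hive.exec.parallel passed in the action's configuration. The action name, the script name (script.q), the transition targets, and the ${jobTracker}, ${nameNode} and ${queueName} parameters are placeholders for illustration, not values from my actual workflow, and the schema version is an assumption that may need adjusting for the Oozie release in use.

```xml
<!-- Hypothetical Hive action; names and ${...} parameters are placeholders -->
<action name="hive-node">
    <hive xmlns="uri:oozie:hive-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <!-- Queue the launched MapReduce jobs are submitted to -->
            <property>
                <name>mapred.job.queue.name</name>
                <value>${queueName}</value>
            </property>
            <!-- Let Hive run independent stages of the query in parallel -->
            <property>
                <name>hive.exec.parallel</name>
                <value>true</value>
            </property>
        </configuration>
        <script>script.q</script>
    </hive>
    <ok to="end"/>
    <error to="fail"/>
</action>
```

With a layout like this, the Oozie launcher job and the MapReduce jobs that Hive starts would all go to the same queue, which is the situation the deadlock question above is about.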