Channel: Hortonworks » All Replies

Reply To: Performance improvement techniques for Spark SQL DataFrames


If you access the files through Hive (instead of loading them into Spark directly), you could perhaps store them in Hive tables partitioned on the join columns. Just a thought, since you mentioned that you do a lot of joins.
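To make the suggestion concrete, here is a rough sketch of the HiveQL that would set up such a table. The table and column names (`orders_raw`, `orders_by_region`, `customer_region` and so on) are made up for illustration and are not from the original question. The ORC storage format and the Spark 2.x `SparkSession` entry point are also assumptions; on older Spark versions you would go through a `HiveContext` instead.

```python
# Sketch: build HiveQL for a table partitioned on the join column.
# All table and column names below are hypothetical placeholders.

def partitioned_table_ddl(table, columns, partition_columns, stored_as="ORC"):
    """Return a CREATE TABLE statement with the join column(s) as partitions."""
    cols = ", ".join(f"{name} {dtype}" for name, dtype in columns)
    parts = ", ".join(f"{name} {dtype}" for name, dtype in partition_columns)
    return (
        f"CREATE TABLE IF NOT EXISTS {table} ({cols}) "
        f"PARTITIONED BY ({parts}) STORED AS {stored_as}"
    )


def load_statement(table, source_table, columns, partition_columns):
    """Return an INSERT that fills the partitions dynamically from source_table."""
    # In a Hive dynamic-partition insert, partition columns come last in the SELECT.
    select_cols = ", ".join(
        [name for name, _ in columns] + [name for name, _ in partition_columns]
    )
    part_names = ", ".join(name for name, _ in partition_columns)
    return (
        f"INSERT OVERWRITE TABLE {table} PARTITION ({part_names}) "
        f"SELECT {select_cols} FROM {source_table}"
    )


def run_with_spark(statements):
    """Run the statements through Spark SQL (needs pyspark and a Hive metastore)."""
    # Imported lazily so the statement builders above work without Spark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .enableHiveSupport()
        .config("hive.exec.dynamic.partition", "true")
        .config("hive.exec.dynamic.partition.mode", "nonstrict")
        .getOrCreate()
    )
    for statement in statements:
        spark.sql(statement)
    return spark


ORDER_COLUMNS = [("order_id", "BIGINT"), ("amount", "DOUBLE")]
ORDER_PARTITIONS = [("customer_region", "STRING")]  # the (assumed) join column

ddl = partitioned_table_ddl("orders_by_region", ORDER_COLUMNS, ORDER_PARTITIONS)
load = load_statement("orders_by_region", "orders_raw", ORDER_COLUMNS, ORDER_PARTITIONS)
```

Calling `run_with_spark([ddl, load])` would create and fill the table. Keep in mind that every distinct value of the partition column becomes its own directory, so this works best when the join column has a fairly small number of distinct values.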

