Index optimization

Hi. I am trying to speed up a query with an index. The query maps an IPv4 address to a country using a geolocation table. However, the geolocation table is fairly large and the query takes an impractical amount of time. I have created an index and set the binary search property to true (the default), but the query is no faster. Note that I am using Tez as the execution engine.

Here is how I set up indexing:

set hive.optimize.index.filter=true;

DROP INDEX IF EXISTS ipv4indexes ON ipv4geotable;
CREATE INDEX ipv4indexes
ON TABLE ipv4geotable (StartIp, EndIp)
AS 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
WITH DEFERRED REBUILD
IDXPROPERTIES ('hive.index.compact.binary.search'='true');
ALTER INDEX ipv4indexes ON ipv4geotable REBUILD;

And here is my query:

DROP TABLE IF EXISTS ipv4table;
CREATE TABLE ipv4table AS
SELECT logon.IP, ipv4.Country
FROM
(SELECT * FROM logontable WHERE isIpv4(IP)) logon
LEFT OUTER JOIN
(SELECT StartIp, EndIp, Country FROM ipv4geotable) ipv4 ON isIpv4(logon.IP)
WHERE ipv4.StartIp <= logon.IP AND logon.IP <= ipv4.EndIp;

The query takes each IP from logontable and finds the range in the geolocation table (which is sorted) that contains it. When a range is found, the corresponding country is returned. I suspect that Hive scans the whole table row by row rather than performing a smarter search (e.g., binary search).
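
To make clear what I mean by a smart search, here is a minimal sketch in Python (outside Hive), assuming the IPs are converted to integers and the ranges are sorted by start and do not overlap. The sample ranges, the placeholder country codes, and the name lookup_country are made up for illustration; this is not what Hive does, only the kind of lookup I was hoping for.

```python
import bisect
import ipaddress

def to_int(ip):
    """Convert a dotted-quad IPv4 string to its integer value."""
    return int(ipaddress.IPv4Address(ip))

# Hypothetical sample ranges (start, end, country), sorted by start and
# non-overlapping. Addresses are from the RFC 5737 documentation blocks;
# the country codes are placeholders, not real geolocation data.
RANGES = [
    (to_int("192.0.2.0"), to_int("192.0.2.255"), "AA"),
    (to_int("198.51.100.0"), to_int("198.51.100.255"), "BB"),
    (to_int("203.0.113.0"), to_int("203.0.113.255"), "CC"),
]
STARTS = [start for start, _, _ in RANGES]

def lookup_country(ip):
    """Return the country whose range contains ip, or None if none does."""
    value = to_int(ip)
    # bisect_right returns the index of the first start greater than value,
    # so the only candidate range is the one just before that index.
    idx = bisect.bisect_right(STARTS, value) - 1
    if idx < 0:
        return None
    start, end, country = RANGES[idx]
    return country if start <= value <= end else None
```

Each lookup here costs O(log n) comparisons against the sorted starts, whereas my query appears to compare every logon row against every geolocation row.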

Any suggestions on how to speed things up? Thanks!
