Channel: Hortonworks » All Replies

Hive UDF function request problem

Hello,
I have a list of values for a specific field (state), sorted by date. I want to display only the rows whose state has changed from the previous date.
Example input:

date state
2013-01-15 04:15:07.602 ON
2013-01-15 05:15:08.502 ON
2013-01-15 06:15:08.502 OFF
2013-01-15 07:15:08.502 ON
2013-01-15 08:15:08.502 ON
...

Expected output:

date state
2013-01-15 04:15:07.602 ON
2013-01-15 06:15:08.502 OFF
2013-01-15 07:15:08.502 ON

My HiveQL query looks like this:

select date, state from demo_bd where statechanged(state) sort by date

“statechanged” is my Java UDF; it returns true only if the current state is different from the previous one. The function works fine when run in plain Java.
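
For reference, the logic described above can be sketched in plain Java. The actual UDF source is not shown in this post, so the class and method names below (StateChangeDetector, evaluate, filterChanges) are hypothetical stand-ins, and the sketch has no Hive dependency. It assumes rows arrive one at a time, in date order, at a single instance. Requires Java 9+ for List.of.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the "statechanged" UDF; the real source is not shown.
final class StateChangeDetector {
    private String previous; // state seen on the previous call; null before the first row

    // Returns true only when the state differs from the one seen on the previous call.
    boolean evaluate(String state) {
        boolean changed = previous == null || !previous.equals(state);
        previous = state;
        return changed;
    }
}

final class StateChangeSketch {
    // Keeps the rows whose state differs from the row just before them.
    // Each row is {date, state}; rows are assumed to be already sorted by date.
    static List<String> filterChanges(List<String[]> rows) {
        StateChangeDetector detector = new StateChangeDetector();
        List<String> kept = new ArrayList<>();
        for (String[] row : rows) {
            if (detector.evaluate(row[1])) {
                kept.add(row[0] + " " + row[1]);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // The example input from this post.
        List<String[]> rows = List.of(
            new String[] {"2013-01-15 04:15:07.602", "ON"},
            new String[] {"2013-01-15 05:15:08.502", "ON"},
            new String[] {"2013-01-15 06:15:08.502", "OFF"},
            new String[] {"2013-01-15 07:15:08.502", "ON"},
            new String[] {"2013-01-15 08:15:08.502", "ON"});
        filterChanges(rows).forEach(System.out::println);
    }
}
```

Run on the example input, this prints the three rows of the expected output. Note that the result depends entirely on a single instance seeing every row in date order.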
My problem is that it seems to work for the first few hundred values, but then it fails: sometimes (not every time) I get the same state for two adjacent dates.
I really don’t see where the problem comes from. Is it related to the way, and the order, in which Hive processes the data?
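
One possible explanation, not confirmed in this thread, is that the filter does not see the rows as one ordered stream. Hive evaluates the WHERE clause before SORT BY, and it may spread the table over several tasks, each with its own copy of the function's remembered state. The plain-Java sketch below simulates that under stated assumptions: the names (SplitStateSketch, filterOneSplit, filterInSplits) and the split point are hypothetical, and the "remember the previous state" logic is a guess at what the UDF does.

```java
import java.util.ArrayList;
import java.util.List;

final class SplitStateSketch {
    // Filters one split with fresh state, as a separate task would (assumption).
    static List<String> filterOneSplit(List<String[]> rows) {
        String previous = null; // nothing remembered at the start of a split
        List<String> kept = new ArrayList<>();
        for (String[] row : rows) {
            if (previous == null || !previous.equals(row[1])) {
                kept.add(row[0] + " " + row[1]);
            }
            previous = row[1];
        }
        return kept;
    }

    // Filters rows [0, splitAt) and [splitAt, size) independently, then concatenates.
    static List<String> filterInSplits(List<String[]> rows, int splitAt) {
        List<String> kept = new ArrayList<>();
        kept.addAll(filterOneSplit(rows.subList(0, splitAt)));
        kept.addAll(filterOneSplit(rows.subList(splitAt, rows.size())));
        return kept;
    }

    public static void main(String[] args) {
        // The example input from this post.
        List<String[]> rows = List.of(
            new String[] {"2013-01-15 04:15:07.602", "ON"},
            new String[] {"2013-01-15 05:15:08.502", "ON"},
            new String[] {"2013-01-15 06:15:08.502", "OFF"},
            new String[] {"2013-01-15 07:15:08.502", "ON"},
            new String[] {"2013-01-15 08:15:08.502", "ON"});
        // Hypothetical split after the fourth row: the second split starts with no memory.
        filterInSplits(rows, 4).forEach(System.out::println);
    }
}
```

With the split after the fourth row, the 08:15 row is kept even though the 07:15 row is also ON, because the second split starts with no memory of the first. That reproduces the "same state for two adjacent dates" symptom on a small scale. Rows reaching the function out of date order would have a similar effect.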

Any help is really appreciated.

Thank you.