Update: There is nothing wrong with the tutorial; it is just very slow. If I stop the Python script after 5 minutes, I get approximately 100 rows with HCatalog. If I leave it running for almost an hour, I get about 500 rows. Sadly, the eventlog has 250,000+ rows! At those rates a full load would take well over a week, not a day or two, so either leave the Python script running that long or try this Flume-based approach instead: http://kzhendev.wordpress.com/2014/04/06/apache-flume-get-logs-out-of-rabbitmq-and-into-hdfs/
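
For a rough sense of how long the full load would take, here is a back-of-the-envelope estimate in Python. It assumes the load rate stays constant and uses only the row counts and timings quoted above; the function name `estimated_hours` is made up for illustration.

```python
def estimated_hours(total_rows: int, rows_loaded: int, minutes_elapsed: float) -> float:
    """Hours needed to load total_rows, assuming the observed rate stays constant."""
    rows_per_hour = rows_loaded / minutes_elapsed * 60
    return total_rows / rows_per_hour


TOTAL_ROWS = 250_000  # lower bound, since the post says "250,000+"

fast = estimated_hours(TOTAL_ROWS, 100, 5)   # 100 rows in 5 minutes
slow = estimated_hours(TOTAL_ROWS, 500, 60)  # 500 rows in roughly an hour

print(f"{fast / 24:.1f} to {slow / 24:.1f} days")
```

Both observed rates put the full load somewhere between one and three weeks, which is why switching to a different ingestion path looks more attractive than waiting.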