Basically you need something in Hadoop which uses the Avro SerDe (Serialiser/Deserialiser). That way you can read the Avro file just like any other. Your main choices are Hive, Pig, or HCatalog.
This page may help, although it is not a beginner's tutorial: https://cwiki.apache.org/confluence/display/Hive/AvroSerDe
Avro is slightly more complicated because you need to specify the schema manually when you define the table (for example through the avro.schema.url or avro.schema.literal table property); it is not automatically deduced from the file. (Hopefully that will change.)
I don’t think you should start off with Avro – start off with a plain file.
Alternatively, you could create a Hive table that is stored as Avro and use the file it produces to learn from.
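To give an idea of what specifying the schema by hand involves, here is a minimal sketch in Python that prints a Hive DDL statement for an external table read through the AvroSerDe. The table name, HDFS location, and record schema are made up for illustration; the SerDe and input/output format class names are the ones given on the wiki page linked above. Treat it as a starting point, not as the one correct way to do it.

```python
import json

# Hypothetical record schema; replace the fields with those of your own data.
SCHEMA = {
    "type": "record",
    "name": "Person",
    "namespace": "example",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "name", "type": "string"},
    ],
}


def avro_table_ddl(table, location, schema):
    """Build Hive DDL for an external table backed by Avro files.

    The schema is embedded as JSON via the avro.schema.literal property,
    so Hive takes the column definitions from it.
    """
    schema_json = json.dumps(schema)
    return (
        f"CREATE EXTERNAL TABLE {table}\n"
        "ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'\n"
        "STORED AS\n"
        "  INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'\n"
        "  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'\n"
        f"LOCATION '{location}'\n"
        f"TBLPROPERTIES ('avro.schema.literal'='{schema_json}');"
    )


if __name__ == "__main__":
    # Table name and location are placeholders, not real paths.
    print(avro_table_ddl("people_avro", "/user/hive/people_avro", SCHEMA))
```

You can paste the printed statement into the Hive shell and then query the table with an ordinary SELECT. If the schema is large, pointing avro.schema.url at a schema file is probably easier to maintain than an inline literal.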