Channel: Hortonworks » All Replies

How to read contents of a SequenceFile in Pig generated by Sqoop import?


Pig’s SequenceFileLoader is unable to read textual data from a sequence file generated by Sqoop. It returns (0,) instead of the keys and contents. Any idea why it loads incorrectly? Am I missing a step?

Details:

I create a simple two-column table (seq_proto) in MySQL with columns of type (int, char(10)) and insert one row, namely (768, mango).

I use Sqoop to import the data from seq_proto into HDFS as a sequence file. (I validate the import by exporting the sequence file back into MySQL and inspecting its contents.)

sqoop import --connect jdbc:mysql://localhost/test --table seq_proto --username root --as-sequencefile -m 1 --target-dir /user/hdfs/seq_proto
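
One way to double-check what the imported file declares is to look at its header, since a SequenceFile stores the key and value class names near the start of the file. The following is only a rough sketch: the helper name seqfile_header_names is made up for illustration, and it assumes the hdfs command is on the PATH and that the part file sits at the path used above. It prints the first identifier-like strings it finds, which should include the key and value class names.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop or Sqoop): read the first bytes
# from stdin and print up to three identifier-like strings found in them.
seqfile_header_names() {
  head -c 200 \
    | tr -c '[:print:]' '\n' \
    | grep -oE '[A-Za-z_][A-Za-z0-9_.$]*' \
    | head -n 3
}

# Path taken from the import command above.
SEQ_FILE="/user/hdfs/seq_proto/part-m-00000"

# Only touch HDFS when the client is actually installed.
if command -v hdfs >/dev/null 2>&1; then
  hdfs dfs -cat "$SEQ_FILE" | seqfile_header_names
fi
```

If the value class printed here is a generated record class rather than a standard Hadoop type such as Text, that may be worth keeping in mind when choosing a loader.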

I compile and package the seq_proto.jar file, then upload seq_proto.jar and sqoop-1.4.4.2.1.1.0-385.jar to HDFS (in the folder /lib/pig/).
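
The compile, package, and upload step can be sketched roughly as follows. This is not the exact command history; it assumes that the Sqoop-generated seq_proto.java and the Sqoop jar are in the current directory, that the record class compiles to a single seq_proto.class, and that the hadoop and hdfs commands are on the PATH. The run helper and the DRY_RUN variable are made up for illustration: by default the script only prints the commands, and it executes them when DRY_RUN=0.

```shell
#!/bin/sh
# Assumed locations (adjust to your setup).
SQOOP_JAR="sqoop-1.4.4.2.1.1.0-385.jar"
HDFS_LIB_DIR="/lib/pig"

# Hypothetical helper: print each command, and run it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

build_and_upload() {
  # "hadoop classpath" prints the Hadoop client classpath; empty if unavailable.
  hadoop_cp="$(hadoop classpath 2>/dev/null || true)"
  # Compile the generated record class against Hadoop and Sqoop.
  run javac -cp "${hadoop_cp}:${SQOOP_JAR}" seq_proto.java
  # Package the compiled class into seq_proto.jar.
  run jar cf seq_proto.jar seq_proto.class
  # Upload both jars to the HDFS folder that the Pig script registers from.
  run hdfs dfs -mkdir -p "$HDFS_LIB_DIR"
  run hdfs dfs -put seq_proto.jar "$SQOOP_JAR" "$HDFS_LIB_DIR/"
}

build_and_upload
```

Running it once in the default dry-run mode makes it easy to review the four commands before executing them with DRY_RUN=0.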

I execute the following Pig script.

REGISTER piggybank.jar

REGISTER /lib/pig/seq_proto.jar

REGISTER /lib/pig/sqoop-1.4.4.2.1.1.0-385.jar

DEFINE SequenceFileLoader org.apache.pig.piggybank.storage.SequenceFileLoader();

a = LOAD '/user/hdfs/seq_proto/part-m-00000' using SequenceFileLoader AS (id:int, value:chararray);

describe a;

dump a;

The script executes successfully and I get the following output:
a: {id: int,value: chararray}

(0,)

I use the Hortonworks Sandbox 2.1 for the above. Any idea why the data loads incorrectly? Am I missing a step?

Thanks and I look forward to your reply. :-)

