I have a very simple Pig script:
REGISTER /home/sn3584/MongoDB/jython-standalone-2.7.0.jar
-- python functions
REGISTER /home/sn3584/MongoDB/test.py using org.apache.pig.scripting.jython.JythonScriptEngine as testudf ;
A = LOAD '/user/sn3584/Data/myfile.txt' as t:int;
B = FOREACH A GENERATE t,testudf.helloworld() AS Hello:chararray ;
dump B ;
test.py is as follows:
#!/usr/bin/python
@outputSchema('word:chararray')
def helloworld():
    return 'Hello, World'
and myfile.txt is as follows:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
When running in local mode, it runs successfully and produces:
(1,Hello, World)
(2,Hello, World)
(3,Hello, World)
(4,Hello, World)
(5,Hello, World)
(6,Hello, World)
(7,Hello, World)
(8,Hello, World)
(9,Hello, World)
(10,Hello, World)
(11,Hello, World)
(12,Hello, World)
(13,Hello, World)
(14,Hello, World)
(15,Hello, World)
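The function itself can also be checked outside Pig. This is only a minimal sketch, assuming plain Python or Jython is on hand: the fallback outputSchema below is a hypothetical stand-in for the decorator that Pig's Jython script engine normally provides, so it exercises the function body only and not the UDF loading that happens on the cluster.

```python
# Hypothetical stand-in: inside Pig, the Jython script engine supplies
# outputSchema. Outside Pig the name is undefined, so fall back to a
# decorator that simply records the schema string on the function.
try:
    outputSchema
except NameError:
    def outputSchema(schema):
        def decorator(func):
            func.outputSchema = schema
            return func
        return decorator

@outputSchema('word:chararray')
def helloworld():
    return 'Hello, World'

if __name__ == '__main__':
    # Call the UDF directly, the same way the FOREACH would for each row.
    print(helloworld())
```

With this guard in place the same file should still load under Pig, because the fallback is only defined when outputSchema is missing.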
But when running in the default mapreduce mode, it fails:
[sn3584@sandbox MongoDB]$ pig MongoTestUDF.pig
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2997: Unable to recreate exception from backed error: Error: java.io.IOException: Deserialization error: could not instantiate 'org.apache.pig.scripting.jython.JythonFunction' with arguments '[/home/sn3584/MongoDB/test.py, helloworld]'
Has anybody seen this error with the latest version of the sandbox? Any help will be much appreciated.
Many thanks.