When I wrote about using Python to write UDFs for Pig, I mentioned that Pig internally uses Jython to run the code, but that 99% of the time this shouldn’t be an issue. Well, I hit the other 1% recently 🙂
I had a small piece of Python code that used the built-in json module to parse JSON data. I converted it into a UDF, and when I tried to call it from Pig, I got a “module not found” exception. After some quick checks, I found that the latest stable version of Jython is 2.5.x, while the json module was only added to Python in 2.6.
After some web searches, I came across jyson through a blog post about using JSON in Jython. jyson is a Java implementation of a JSON codec for Jython 2.5 that can be used as a drop-in replacement for Python’s built-in json module.
I downloaded the jyson jar and added it to Pig’s pig.additional.jars property (passed on the command line as -Dpig.additional.jars). In the Python code, I changed the import statement to import com.xhaus.jyson.JysonCodec as json. After that, everything started to work again 🙂
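For illustration, here is a minimal sketch of what such a UDF file might look like. It is not my original code: the function name extract_name, the "name" field, and the schema string are hypothetical, and the no-op outputSchema stand-in is an assumption added only so the file can also be run outside Pig. The sketch tries the standard json module first and falls back to jyson only when that import fails, so the same file should work on Jython 2.5 and on a regular Python interpreter.

```python
# Prefer the standard library json module; fall back to jyson on Jython 2.5,
# where json does not exist. The jyson jar must be on Pig's classpath.
try:
    import json
except ImportError:
    import com.xhaus.jyson.JysonCodec as json

try:
    outputSchema  # Pig makes this decorator available when it loads the script
except NameError:
    # Assumption: a no-op stand-in so the file can be run and tested outside Pig.
    def outputSchema(schema):
        def wrap(func):
            return func
        return wrap


@outputSchema("name:chararray")
def extract_name(line):
    # Hypothetical UDF: return the "name" field of a JSON record, or None
    # when the input is null or the field is missing.
    if line is None:
        return None
    record = json.loads(line)
    return record.get("name")
```

Only the import differs between the two environments; the rest of the code keeps calling json.loads as before, which is what makes jyson usable as a drop-in replacement here.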
This was quite useful. Thank you.