Nov 30, 2013
After I figured out how to use Python to write Pig UDFs, I got interested in Jython and wanted to play around with it. So I installed it on my Mac through Homebrew by running the following command.
brew install jython
Everything installed properly, and I was able to run Jython after adding the following lines to my shell profile.
export JYTHON_HOME=$(brew --prefix jython)/libexec
But there was one small annoyance. Every time I started Jython, I got the following strange warning.
expr: syntax error
After a couple of web searches, I stumbled upon an email thread on the Jython-users mailing list. The warning comes from a bug in the bash script that is used to start Jython. I opened the /usr/local/Cellar/jython/2.5.3/libexec/bin/jython file and changed the line
if expr "$link" : '/' > /dev/null; then
to
if expr "$link" : '[/]' > /dev/null; then
which fixed the error.
I am exploring Jython further and will keep you updated if I find something interesting.
Posted in Python
Tagged brew, Homebrew, Jython, Python
Sep 17, 2013
When I wrote about using Python to write UDFs for Pig, I mentioned that Pig internally uses Jython to run the code, but that 99% of the time this shouldn’t be an issue. Well, I hit the other 1% recently 🙂
I had a small piece of Python code that used the built-in json module to parse JSON data. I converted it into a UDF, and when I tried to call it from Pig, I got a “module not found” exception. After some quick checks, I found that the latest stable version of Jython is 2.5.x, whereas the json module was only added in Python 2.6.
After some web searches, I came across jyson through a blog post about using JSON in Jython. jyson is a Java implementation of a JSON codec for Jython 2.5 that can be used as a drop-in replacement for Python’s built-in json module.
I downloaded the jyson jar and added it to Pig’s -Dpig.additional.jars property. In the Python code, I changed the import statement to import com.xhaus.jyson.JysonCodec as json. After that everything started to work again 🙂
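For illustration, here is a minimal sketch of what such a UDF file could look like. It is not my actual code: the function name extract_name and the "name" field are made up for the example, and the fallback import assumes the jyson jar is already on Pig's classpath. The outputSchema stand-in is only there so that the file can also be run outside Pig, where that decorator is not defined.

```python
# Prefer the standard json module; fall back to jyson where it is
# missing (for example on Jython 2.5).
try:
    import json
except ImportError:
    # Assumes the jyson jar has been made available to Pig.
    import com.xhaus.jyson.JysonCodec as json

# Pig normally provides the outputSchema decorator when it loads a
# Jython UDF script. This no-op stand-in is an assumption of this
# sketch so the file also runs under a plain Python interpreter.
try:
    outputSchema
except NameError:
    def outputSchema(schema):
        def wrap(func):
            return func
        return wrap


@outputSchema("name:chararray")
def extract_name(raw):
    """Return the 'name' field of a JSON object string, or None."""
    if raw is None:
        return None
    record = json.loads(raw)
    return record.get("name")
```

With the try/except import, the same file works unchanged once Jython ships its own json module, because the jyson import is only attempted when the standard one is missing.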
Posted in Hadoop/Pig, Python
Tagged JSON, Jython, Pig, Python, UDF