Tag Archives: Import

Including external Pig files into Pig Latin scripts

In one of my projects, we had a huge number of Pig scripts which dealt with data from a single source. The schema for this common data source is quite complex and changes every few months. Since the schema was present in every Pig file, it was a real pain to update all the Pig scripts whenever it changed.

I was looking for a way to move the schema into a separate Pig file and then include it in all the other Pig scripts, like how you import a class in Java, instead of copy-pasting it into every Pig file.

After some quick web searches, I found that from Pig 0.9 onwards this feature is indeed available in Pig itself. It is the import statement, which was introduced together with macros. All you need to do is add the following line to your Pig script at the point where you want the other file to be included.

import 'other-file.pig';

You can either give a relative path in the line above, or set a search path that tells Pig where to look for the imported scripts. If you want to set the search path, you can do something like this.

set pig.import.search.path '/usr/local/pig,/grid/pig';
import 'external-file.pig';
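
To make the schema case concrete, here is a rough sketch of how the shared file and one of the scripts might look. I have not copied this from my actual project; the file names, the macro name load_source, the input path and the three columns are all made up for illustration, and it assumes tab-separated input. The idea is to wrap the LOAD statement (and with it the schema) in a macro, so that a schema change needs to be made in only one file.

```pig
-- common-schema.pig (hypothetical shared file)
-- The schema lives inside one macro, so it is defined in a single place.
-- load_source, input_path and the column names are illustrative assumptions.
DEFINE load_source(input_path) RETURNS records {
    $records = LOAD '$input_path' USING PigStorage('\t')
               AS (user_id:chararray, event_time:long, action:chararray);
};
```

```pig
-- report.pig (hypothetical script that reuses the shared schema)
import 'common-schema.pig';

-- Load the data through the macro instead of repeating the schema here.
events  = load_source('/data/source/current');
by_user = GROUP events BY user_id;
counts  = FOREACH by_user GENERATE group AS user_id, COUNT(events) AS total;
DUMP counts;
```

If common-schema.pig is not in the same directory as the script, it would have to be reachable through pig.import.search.path as described above.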

Now my Pig scripts are organized properly. Hope this helps you as well 🙂

Posted in Hadoop/Pig

[Poll] – RoloPress Importer

With the maintenance release out, we are planning to work on importers, which will allow you to import contact data from other programs.

I want you to help me choose which importer I should work on first, so that it can be released in the next version of RoloPress.

So please cast your vote, and we will work on the importer which receives the most votes.

(You can cast your vote here directly, if you are not able to see the poll widget or if you are reading this post from a feed reader)

Thanks 🙂

Posted in WordPress | 5 Comments