I’m using Pig in a Java application. Currently I have a thread that runs a Pig query every 15 minutes. After every run I’m left with the MR job jar in my temp folder, in my case /tmp.
The way the code is structured, one instance of PigServer is created on startup. Then, in a loop, I re-register a query with different partitions and execute it via the openIterator call. The PigServer is not shut down until the thread is shut down.
So my question is: is there a call I need to perform to clean up the jars? Do I need to shut down the PigServer after every execution? Or should I just clean up the file system myself after the query completes?
It appears you do need to create and destroy your PigServer object after each use to clean up the pig* directories in the temp space. However, this doesn’t appear to clean up the job jars, so I had to implement my own cleanup function to handle them.
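
For anyone needing something similar, here is a minimal sketch of what such a cleanup function could look like. It is not part of Pig: the class name PigTempCleaner, the Job*.jar file name pattern, and the age threshold are all assumptions for illustration, so check what actually accumulates in your temp folder before using a pattern like this. It only deletes files older than a given age, so that a jar belonging to a query that is still running is left alone.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper, not a Pig API.
public class PigTempCleaner {

    /**
     * Deletes regular files directly inside dir whose names match the glob
     * and whose last-modified time is older than minAge.
     * Returns the number of files deleted.
     */
    public static int deleteStaleFiles(Path dir, String glob, Duration minAge)
            throws IOException {
        Instant cutoff = Instant.now().minus(minAge);
        int deleted = 0;
        // The glob is matched against file names only; subdirectories are not walked.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, glob)) {
            for (Path p : stream) {
                boolean stale = Files.isRegularFile(p)
                        && Files.getLastModifiedTime(p).toInstant().isBefore(cutoff);
                if (stale && Files.deleteIfExists(p)) {
                    deleted++;
                }
            }
        }
        return deleted;
    }
}
```

After each query completes, it could be called as PigTempCleaner.deleteStaleFiles(Paths.get("/tmp"), "Job*.jar", Duration.ofMinutes(30)), with the pattern and threshold adjusted to whatever your setup actually leaves behind.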