I wrote some MapReduce jobs that reference a few external jars,
so I added the jars to the CLASSPATH of the “running” cluster in order to run the jobs.
When I tried to run them, I got class not found exceptions.
I Googled for ways to fix it and found that I needed to restart the cluster to apply
the changed CLASSPATH, and that actually worked.
Oh, yuck!
Do I really need to restart the cluster every time I add new jars to the CLASSPATH?
I don’t think that makes sense.
Does anyone know how to apply the changes without restarting it?
I think I need to add some detail to get useful advice.
I wrote a custom HBase filter class and packed it in a jar.
Then I wrote a MapReduce job that uses the custom filter class and packed it in another jar.
Because the filter class jar wasn’t on the classpath of my “running” cluster, I added it.
But I couldn’t run the job until I restarted the cluster.
Of course, I know I could have packed the filter class and the job together in a single jar.
But that’s not what I’m after.
I’m curious whether I have to restart the cluster again whenever I need to add new external jars.
Check the Cloudera article on including third-party libraries required by a job. Options (1) and (2) there don’t require the cluster to be restarted.
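
As a rough illustration, assuming option (1) in that article is Hadoop's `-libjars` generic option, the sketch below just assembles the `hadoop jar` command line that ships an extra jar with the job at submit time. The helper class `LibJarsCommand`, the jar names, the paths, and the main class are all hypothetical placeholders, not taken from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds a `hadoop jar` command that passes extra jars via -libjars.
class LibJarsCommand {

    // Returns the argument list: hadoop jar <jobJar> <mainClass> [-libjars a,b,...] <jobArgs...>
    static List<String> build(String jobJar, String mainClass,
                              List<String> extraJars, List<String> jobArgs) {
        List<String> cmd = new ArrayList<>();
        cmd.add("hadoop");
        cmd.add("jar");
        cmd.add(jobJar);
        cmd.add(mainClass);
        if (!extraJars.isEmpty()) {
            // -libjars takes one comma-separated list and, as a generic option,
            // has to come before the job's own arguments.
            cmd.add("-libjars");
            cmd.add(String.join(",", extraJars));
        }
        cmd.addAll(jobArgs);
        return cmd;
    }

    public static void main(String[] args) {
        // All names and paths below are made up for the example.
        List<String> cmd = build(
                "filter-job.jar",
                "com.example.FilterJob",
                Arrays.asList("/opt/jars/custom-filter.jar"),
                Arrays.asList("input", "output"));
        System.out.println(String.join(" ", cmd));
    }
}
```

Note that `-libjars` is only honored when the job's driver parses Hadoop's generic options, for example by being launched through `ToolRunner`. If the driver reads `args` directly, the option is passed through as an ordinary argument and the jar is not shipped.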