I have EPD 7.3.1 installed (nowadays called Enthought Canopy), which comes with scikit-learn v 0.11. I am running Ubuntu 12.04. I need to install v 0.12 of scikit-learn.
The scikit-learn docs say to clone the repository, add the scikit-learn directory to your PYTHONPATH, and build the extension in place:
python setup.py build_ext --inplace
The problem is that EPD is its own closed world (with multiple scikit dirs):
./lib/python2.7/site-packages/scikits/
./lib/python2.7/site-packages/sklearn
And then there’s:
./EGG-INFO/scikit_learn/
I really don’t want to experiment as it has taken a very long time to get things tuned to this point. Should I follow scikit-learn’s directions in this case?
The steps described on the scikit-learn website work irrespective of the scikit-learn version in EPD. Python will automatically use the scikit-learn version found through the PYTHONPATH environment variable, which you should set to the directory path of the Git version of scikit-learn.

If you use Bash on a Unix-like system, you should do the following:

1. Clone and build the Git version of scikit-learn in a directory of your choice (e.g. /home/yourname/bin/scikit-learn).
2. Open .bashrc and add the line:
   export PYTHONPATH="/home/yourname/bin/scikit-learn";
3. Open a new terminal and start python.
4. Run import sklearn.
5. Run sklearn.__version__; this should now show '0.12-git' instead of 0.11.

Why does this work? Python uses the variable
sys.path (a list of paths) internally to keep track of all the directories where it should look for modules and packages. Once a module or package is requested, Python will sequentially go through this list until it has found a match. So, e.g., a module can be present in several directories listed in sys.path, but only the version which appears first in the list will be used.

Every Python installation will have its own default set of paths listed in sys.path. One way of extending sys.path is by listing paths in PYTHONPATH. Once Python starts it will read this environment variable and add its entries near the start of the sys.path list, ahead of the installation's default paths. So if you add the path to another version of scikit-learn to your PYTHONPATH then (EPD’s) Python will find that version of scikit-learn first and use it instead of the version listed further on in sys.path.

To view sys.path, simply import sys and then print sys.path. Also, e.g., if you only want to use the 0.12 version of scikit-learn in one Python program and use the 0.11 version as default in all other Python programs then you could leave the PYTHONPATH empty and only insert the path to scikit-learn 0.12 manually at the top of your code:
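A minimal sketch of that per-program approach. The checkout path is only the example path used earlier, and prepend_path is a hypothetical helper name (not part of Python or scikit-learn), so adjust both to your setup:

```python
import sys

# Assumed location of the scikit-learn 0.12 Git checkout (example path only).
SKLEARN_DEV_PATH = "/home/yourname/bin/scikit-learn"


def prepend_path(path, search_path=None):
    """Move or add `path` to the front of the search list so it is tried first."""
    if search_path is None:
        search_path = sys.path
    if path in search_path:
        search_path.remove(path)
    search_path.insert(0, path)
    return search_path


# Must run before the first `import sklearn` in this program.
prepend_path(SKLEARN_DEV_PATH)

# From here on, `import sklearn` searches SKLEARN_DEV_PATH before EPD's
# site-packages, so sklearn.__version__ should report the Git version.
```

Because the change only affects sys.path inside the running interpreter, every other program keeps using EPD's bundled 0.11.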