I am using Scipy’s KDTree implementation to read a large file of 300 MB. Now, is there a way I can just save the datastructure to disk and load it again or am I stuck with reading raw points from file and constructing the data structure each time I start my program? I am constructing the KDTree as follows:
def buildKDTree(self):
self.kdpoints = numpy.fromfile("All", sep=' ')
self.kdpoints.shape = self.kdpoints.size / self.NDIM, NDIM
self.kdtree = KDTree(self.kdpoints, leafsize = self.kdpoints.shape[0]+1)
print "Preparing KDTree... Ready!"
Any suggestions please?
KDtree uses nested classes to define its node types (innernode, leafnode). Pickle only works on module-level class definitions, so a nested class trips it up:
However, there is a (hacky) workaround by monkey-patching the class definitions into the
scipy.spatial.kdtreeat module scope so the pickler can find them. If all of your code which reads and writes pickled KDtree objects installs these patches, this hack should work fine:Output: