I’m trying to improve the speed of an algorithm and, after looking at which operations are being called, I’m having difficulty pinning down exactly what’s slowing things down. I’m wondering whether Python’s deepcopy() could be the culprit, or whether I should look further into my own code.
Looking at the code (you can too), it goes through every object in the tree of referenced objects (e.g. a dict’s keys and values, object member variables, …) and does two things for them:
1. checks whether the object has already been copied, by looking it up in the memo dict
2. copies the object if it has not
The second one is O(1) for simple objects. For composite objects, the same routine handles them, so over all n objects in the tree, that’s O(n). The first part, looking an object up in a dict, is O(1) on average, but O(n) in the worst case.
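To make the two steps concrete, here is a minimal sketch of a memo-based deep copy. It is an illustration only, not the standard library's implementation: the name sketch_deepcopy is hypothetical, and it assumes the data consists solely of lists, dicts, and a few immutable atoms.

```python
def sketch_deepcopy(obj, memo=None):
    """Simplified illustration of the lookup-then-copy routine.

    Hypothetical helper, NOT the real copy.deepcopy: it only handles
    lists, dicts, and a handful of immutable atomic types.
    """
    if memo is None:
        memo = {}

    key = id(obj)

    # Step 1: has this object already been copied? One dict lookup,
    # O(1) on average.
    if key in memo:
        return memo[key]

    # Step 2: copy the object.
    if isinstance(obj, (int, float, str, bytes, bool, type(None))):
        # Immutable atoms can be shared between original and copy.
        result = obj
        memo[key] = result
    elif isinstance(obj, list):
        result = []
        # Register the copy before recursing so cycles terminate.
        memo[key] = result
        for item in obj:
            result.append(sketch_deepcopy(item, memo))
    elif isinstance(obj, dict):
        result = {}
        memo[key] = result
        for k, v in obj.items():
            result[sketch_deepcopy(k, memo)] = sketch_deepcopy(v, memo)
    else:
        raise TypeError(f"unsupported type in this sketch: {type(obj).__name__}")

    return result


# Usage: a shared sublist and a self-reference are each copied once.
shared = [1, 2]
data = {"a": shared, "b": shared}
data["self"] = data
clone = sketch_deepcopy(data)
```

Every object in the tree is visited once and costs one memo lookup plus its own copy, which is where the linear behaviour comes from when the lookups stay O(1).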
So at best, on average, deepcopy is linear. The keys used in memo are id() values, i.e. memory locations, so they are not randomly distributed over the key space (the “average” part above) and it may behave worse, up to the O(n^2) worst case. I did observe some performance degradations in real use, but for the most part, it behaved as linear.
That’s the complexity part, but the constant is large and deepcopy is anything but cheap and could very well be causing your problems. The only sure way to know is to use a profiler, so do it. FWIW, I’m currently rewriting terribly slow code that spends 98% of its execution time in deepcopy.
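As a rough sketch of what profiling could look like, the following uses the standard library's cProfile and pstats modules. The functions build_tree, work, and profile_work are hypothetical stand-ins for your own data and algorithm; only the profiling calls themselves are the point.

```python
import copy
import cProfile
import io
import pstats


def build_tree(n):
    """Hypothetical workload: a list of n small nested dicts."""
    return [{"id": i, "tags": [i, i + 1]} for i in range(n)]


def work(data):
    """Hypothetical stand-in for the algorithm under investigation."""
    clone = copy.deepcopy(data)
    return sum(item["id"] for item in clone)


def profile_work(n=2000):
    """Run work() under cProfile and return its result plus a text report."""
    data = build_tree(n)

    profiler = cProfile.Profile()
    profiler.enable()
    result = work(data)
    profiler.disable()

    # Collect the ten most expensive entries by cumulative time.
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()


result, report = profile_work()
print(report)
```

If the deepcopy rows account for most of the cumulative time in the report, the copying is the bottleneck; if not, the problem is more likely elsewhere in your own code.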