This is pretty bad micro-optimizing, but I’m just curious. It usually doesn’t make a difference in the “real” world.
So I’m compiling a function (that does nothing) using compile(), then calling exec on that code and getting a reference to the function I compiled. Then I’m executing it ten million times and timing it, and repeating that with a local function. Why is the dynamically compiled function around 15% slower (on Python 2.7.2) for just the call?
    import datetime

    def getCompiledFunc():
        cc = compile("def aa():pass", '<string>', 'exec')
        dd = {}
        exec cc in dd
        return dd.get('aa')

    compiledFunc = getCompiledFunc()

    def localFunc():pass

    def testCall(f):
        st = datetime.datetime.now()
        for x in xrange(10000000): f()
        et = datetime.datetime.now()
        return (et-st).total_seconds()

    for x in xrange(10):
        lt = testCall(localFunc)
        ct = testCall(compiledFunc)
        print "%s %s %s%% slower" % (lt, ct, int(100.0*(ct-lt)/lt))
The output I’m getting is something like:
1.139 1.319 15% slower
The dis.dis() function shows that the code object for each version is identical.
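One way to check this yourself, as a minimal sketch: disassemble both functions and compare their raw bytecode. The snake_case names are illustrative stand-ins for the question's functions, and the function-call form of exec and the `__code__` attribute are used so the snippet runs on Python 2.7 and Python 3 alike (on 2.7, `__code__` is the same object as func_code).

```python
import dis

def get_compiled_func():
    # Compile and exec the source in a private dictionary, as in the question.
    code = compile("def aa():pass", '<string>', 'exec')
    namespace = {}
    exec(code, namespace)
    return namespace['aa']

compiled_func = get_compiled_func()

def local_func():pass

# Human-readable disassembly of each function body.
dis.dis(compiled_func)
dis.dis(local_func)

# The raw bytecode can also be compared directly.
same_bytecode = compiled_func.__code__.co_code == local_func.__code__.co_code
print(same_bytecode)
```

The exact opcodes printed depend on the interpreter version, but both listings should match each other.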
So the difference is in the function object. I compared each of the fields (func_doc, func_closure, etc.) and the one that is different is func_globals. In other words, localFunc.func_globals != compiledFunc.func_globals.
There is a cost for supplying your own dictionary instead of the built-in globals (the former has to be looked up when a stack frame is created on each call, and the latter can be referenced directly by the C code, which already knows about the default builtin globals dictionary).
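A small sketch of what that private dictionary looks like after exec. The helper returning both the function and the dictionary is an illustrative variation on the question's code, and `__globals__` is the same attribute as func_globals on Python 2.6+ (and the only spelling on Python 3). That frame creation consults the '__builtins__' key of a globals dictionary differing from the caller's is an assumption about CPython internals, not something this snippet proves.

```python
def get_compiled_func():
    code = compile("def aa():pass", '<string>', 'exec')
    dd = {}
    exec(code, dd)
    # Return the dictionary too, so it can be inspected afterwards.
    return dd.get('aa'), dd

compiled_func, dd = get_compiled_func()

def local_func():pass

# The compiled function's globals are the private dictionary itself...
uses_private_dict = compiled_func.__globals__ is dd
# ...which is a different object from the module globals local_func uses.
shares_module_globals = compiled_func.__globals__ is local_func.__globals__

print(uses_private_dict)
print(shares_module_globals)
# exec adds a '__builtins__' entry to a globals dictionary that lacks one.
print(sorted(dd))  # ['__builtins__', 'aa']
```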
This is easily verified by changing the exec line in your code so that cc is executed with globals() as its globals and dd as its locals.
With that change, the timing difference goes away.
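As a hedged sketch of that experiment, the snippet below builds the function both ways and times all three variants. The helper names and the use of timeit (instead of the question's datetime loop) are assumptions made for brevity, and the function-call form of exec is used so it runs on Python 2.7 and Python 3. Timings vary by interpreter version and machine, so no numbers are shown.

```python
import timeit

SOURCE = "def aa():pass"

def compiled_with_own_globals():
    # Original approach: the private dict becomes the function's globals.
    namespace = {}
    exec(compile(SOURCE, '<string>', 'exec'), namespace)
    return namespace['aa']

def compiled_with_module_globals():
    # Changed approach: module globals as globals, private dict as locals,
    # so 'aa' still lands in namespace but shares the module's globals.
    namespace = {}
    exec(compile(SOURCE, '<string>', 'exec'), globals(), namespace)
    return namespace['aa']

def local_func():pass

own = compiled_with_own_globals()
shared = compiled_with_module_globals()

# Only the second variant shares its globals with local_func.
print(own.__globals__ is local_func.__globals__)
print(shared.__globals__ is local_func.__globals__)

for name, func in [("local", local_func),
                   ("own globals", own),
                   ("module globals", shared)]:
    seconds = timeit.timeit(func, number=1000000)
    print("%-15s %.3f s" % (name, seconds))
```

Running the loop several times and comparing the fastest results gives a steadier picture than a single pass.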
Mystery solved!