Compiled code such as C consumes little memory.
Interpreted code such as Python consumes more memory, which is understandable.
With JIT, a program is (selectively) compiled into machine code at run time. So shouldn’t the memory consumption of a JIT’ed program be somewhere between that of a compiled and an interpreted program?
Instead, a JIT’ed program (such as one run under PyPy) consumes several times more memory than the equivalent interpreted program (such as one run under Python). Why?
Tracing JIT compilers take quite a bit more memory because they need to keep not only the bytecode for the VM but also the directly executable machine code. This is only half the story, however.
Most JITs also keep a lot of metadata about the bytecode (and even the machine code) so they can determine what needs to be JIT’ed and what can be left alone. Tracing JITs (such as LuaJIT) also create trace snapshots, which are used to fine-tune code at run time by performing things like loop unrolling or branch reordering.
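To make that bookkeeping concrete, here is a minimal Python sketch of a toy "JIT" that retains three structures at once: the original bytecode, per-function hotness counters, and a compiled form. It is only a model under stated assumptions: `ToyJit`, `HOT_THRESHOLD`, and the two-opcode bytecode are hypothetical names invented for illustration, and the "compiled" form is a Python closure standing in for machine code, not anything PyPy or LuaJIT actually does.

```python
from collections import Counter

HOT_THRESHOLD = 3  # hypothetical value; real JITs tune their own thresholds


class ToyJit:
    """Toy model of JIT bookkeeping. It does not emit real machine code."""

    def __init__(self):
        self.bytecode = {}         # name -> list of (op, arg); kept for the whole run
        self.counters = Counter()  # name -> interpreted-call count (profiling metadata)
        self.compiled = {}         # name -> closure standing in for machine code

    def define(self, name, ops):
        self.bytecode[name] = ops

    def _interpret(self, ops, x):
        # Slow path: dispatch on every opcode, every time.
        for op, arg in ops:
            if op == "add":
                x += arg
            elif op == "mul":
                x *= arg
            else:
                raise ValueError(f"unknown op {op!r}")
        return x

    def _compile(self, ops):
        # "Compile" by resolving each opcode to a function once, up front.
        funcs = {"add": lambda x, a: x + a, "mul": lambda x, a: x * a}
        steps = [(funcs[op], arg) for op, arg in ops]

        def run(x):
            for func, arg in steps:
                x = func(x, arg)
            return x

        return run

    def call(self, name, x):
        fast = self.compiled.get(name)
        if fast is not None:
            return fast(x)
        self.counters[name] += 1
        if self.counters[name] >= HOT_THRESHOLD:
            # The bytecode is NOT discarded: both forms now live in memory.
            self.compiled[name] = self._compile(self.bytecode[name])
        return self._interpret(self.bytecode[name], x)


jit = ToyJit()
jit.define("f", [("add", 2), ("mul", 3)])
results = [jit.call("f", i) for i in range(5)]
```

After the loop, `jit.bytecode`, `jit.counters`, and `jit.compiled` all hold an entry for `"f"`. A plain interpreter would need only the first of the three.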
Some also keep caches of commonly used code segments or fast lookup buffers to speed up the creation of JIT’ed code. LuaJIT does this via DynASM, which can actually help reduce memory usage when done correctly.
The memory usage depends greatly on the JIT engine employed and the nature of the language it compiles (strongly vs. weakly typed). Some JITs employ advanced techniques such as SSA-based register allocators and variable liveness analysis. These kinds of optimizations consume memory as well, along with more common ones like loop variable hoisting.
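As a rough illustration of why such analyses cost memory, here is a small Python sketch of a backward liveness pass over straight-line code. It is a simplified textbook version, not the algorithm of any particular JIT. The `liveness` function and the `(dest, sources)` instruction format are assumptions made up for this example. Note that it stores one set of live variables per instruction, which is extra data a plain interpreter never builds.

```python
def liveness(instructions, live_out):
    """Backward liveness pass over straight-line code.

    Each instruction is a (dest, sources) pair. Returns, for every
    instruction, the set of variables that are live just before it.
    """
    live = set(live_out)
    live_before = [None] * len(instructions)
    for i in range(len(instructions) - 1, -1, -1):
        dest, sources = instructions[i]
        live.discard(dest)    # dest is overwritten here, so its old value is dead
        live.update(sources)  # every source must be live on entry
        live_before[i] = frozenset(live)
    return live_before


# Hypothetical three-address program; the operators are irrelevant to liveness.
prog = [
    ("t1", ("a", "b")),   # t1 = a + b
    ("t2", ("t1", "c")),  # t2 = t1 * c
    ("r", ("t2", "a")),   # r  = t2 - a
]
sets = liveness(prog, live_out={"r"})
```

A register allocator would use these sets to decide which values can share a register. The point here is only that the sets exist alongside the code for as long as the compiler needs them.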