Suppose I would someday like to write a compiler that generates C code for a language that:
- has purely reference-counting GC,
- lets you return to a point multiple frames up the stack (stealing Common Lisp’s (return-from) form),
- and lets you define a destructor for each data type (to be called immediately before the memory for the data structure is freed).
How would you go about implementing this? Could you do it (well) with C’s setjmp() and longjmp() and a global list of points to return to for clean up while unwinding the stack?
Another possibility is to just generate C++ code.
There are a couple of ways I’d consider first off.
One is similar to exception frames. Create a per-thread linked list of structures. Before calling a function, create a structure on the stack and add it to the end of the list. The structure contains a jmp_buf. Call setjmp and, if it returns 0, continue with this function; otherwise (somehow) check whether you are the target of the return-from. If so, continue with the function; otherwise clean up your locals and longjmp to the previous frame.
I think this might be overkill; it depends how flexible this return-from needs to be. Is it necessary only to return from a function that is in a surrounding lexical scope, or are we actually searching up the call stack for a named function that may or may not even be present? If the former, I doubt that we need this.
Another possibility is to make all reference-counted types implement a common interface (internal to the language implementation, that is; it needn’t be visible to users). Then you can just create a stack of (pointers to) objects that require cleanup, together with the ability to create index points in that stack corresponding to call-stack levels. On function exit, you clean up all variables below the stack level to which you are returning, just by looping through the list and dereffing each one, rather than needing to jump to cleanup code segments in the routines above us on the call stack.
Then you could longjmp straight to the target, or you could come up with a “calling convention” in which the return value as far as your language is concerned is actually stored in a location determined by a pointer parameter (an out-param), while the return value as far as C is concerned indicates where on the stack you’re returning to. Callers therefore check whether the return value matches their own level and, if not, return immediately, so there’s no need for longjmp. It may or may not be more efficient to longjmp, depending on how many stack levels you’re skipping, and hence how many repetitions of check-and-return.
This scheme is a bit like the cleanup stack in Symbian/C++. In fact that goes a bit further: it’s not the resource to be cleaned up that has to implement the common interface; what goes on the stack is a TCleanupItem consisting of a function that knows how to free the resource, and some data to feed to that function.
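To make the cleanup stack and the check-and-return calling convention a bit more concrete, here is a rough sketch of what the generated C might look like. Everything in it is an assumption made up for illustration: the Obj header, the cleanup_* helpers, the NORMAL sentinel, the level numbering, and the three functions outer/middle/inner. It is single-threaded, uses a fixed-size stack, and hard-codes the return-from target as "two frames up", where a real compiler would presumably know the target from the lexical block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical common header shared by every reference-counted object. */
typedef struct Obj {
    int refcount;
    void (*destroy)(struct Obj *self); /* user-defined destructor, may be NULL */
} Obj;

static void obj_release(Obj *o) {
    if (--o->refcount == 0) {
        if (o->destroy) o->destroy(o); /* runs immediately before the free */
        free(o);
    }
}

/* Cleanup stack: pointers to the objects owned by live frames. */
enum { CLEANUP_MAX = 256 };
static Obj *cleanup_stack[CLEANUP_MAX];
static size_t cleanup_top = 0;

static size_t cleanup_mark(void) { return cleanup_top; }

static void cleanup_push(Obj *o) {
    assert(cleanup_top < CLEANUP_MAX);
    cleanup_stack[cleanup_top++] = o;
}

/* Release everything pushed since `mark` (one frame or several). */
static void cleanup_unwind(size_t mark) {
    while (cleanup_top > mark) obj_release(cleanup_stack[--cleanup_top]);
}

/* Demo destructor that just counts how often it ran. */
static int destroyed = 0;
static void count_destroy(Obj *self) { (void)self; destroyed++; }

static Obj *obj_new(void) {
    Obj *o = malloc(sizeof *o);
    assert(o != NULL);
    o->refcount = 1;
    o->destroy = count_destroy;
    return o;
}

/* Assumed convention: the language-level result goes through `result`;
   the C return value is NORMAL, or the level of the frame being returned to. */
enum { NORMAL = 0 };

static int inner(int level, int *result) {
    size_t mark = cleanup_mark();
    cleanup_push(obj_new());  /* a local owned by this frame */
    *result = 42;             /* value handed to the return-from */
    cleanup_unwind(mark);
    return level - 2;         /* non-local return: target is two frames up */
}

static int middle(int level, int *result) {
    size_t mark = cleanup_mark();
    cleanup_push(obj_new());
    int target = inner(level + 1, result);
    if (target != NORMAL) {   /* not for us: clean up and keep returning */
        cleanup_unwind(mark);
        return target;
    }
    *result = -1;             /* rest of the body, skipped in this demo */
    cleanup_unwind(mark);
    return NORMAL;
}

static int outer(int level, int *result) {
    size_t mark = cleanup_mark();
    cleanup_push(obj_new());
    int target = middle(level + 1, result);
    if (target != NORMAL && target != level) {
        cleanup_unwind(mark);
        return target;        /* aimed at a frame even further up */
    }
    /* target == level: we are the return-from target; *result is the value. */
    cleanup_unwind(mark);
    return NORMAL;
}

/* Runs the three-frame example and returns the language-level result. */
int demo_run(void) {
    int result = 0;
    int target = outer(1, &result);
    assert(target == NORMAL);
    return result;
}
```

Calling demo_run() should walk back from inner through middle to outer, releasing one object per frame on the way. To use longjmp instead, the same cleanup_unwind(mark) call could be made once with the target frame's mark before jumping, which skips the per-frame checks.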