I am faced with the task of building a new component to be integrated into a large existing C codebase. The component is essentially a kind of compiler, and will be complicated enough that I would like to write it in OCaml (for reasons along the lines of those given here). I know that OCaml-C interaction is possible (as per the manual and this tutorial), but it looks somewhat painful.
What I’d like to know is whether others here have attempted large-scale integration of OCaml and C code, what were some of the unexpected gotchas they found, and whether at the end of the day they concluded that they would have been better off just writing the new code in C.
Note, I’m not trying to start a debate about the merits of functional versus imperative programming: let’s just say we assume that OCaml happens to be the right tool for the job I have in mind, and the potential difficulty in integration is the only issue. I also don’t have the option of rewriting the rest of the codebase.
To give a little more detail about the task: the component I need to implement is a certain kind of query optimizer that incorporates some research ideas my group at UC Davis is working on, and will be integrated into PostgreSQL so that we can run experiments. (A query optimizer is, essentially, a compiler.) The component would be invoked from C code, would function mostly independently but would make a certain number of calls to other PostgreSQL components to retrieve things like system catalog information, and would construct a complex C data structure (representing a physical query plan) as output.
Apologies for the somewhat open-ended question, but I’m hoping the community might be able to save me a little trouble 🙂
Thanks,
TJ
Great question. You should be using the better tool for the job.
If in fact your intentions are to use the better tool for the job (and you are sure lexx and yacc are going to be a pain) then I have something to share with you; it’s not painful at all to call ocaml from c, and vice versa. Most of the time I’ve been writing ocaml calling C, but I have written a few the other way. They’ve mostly been debug functions that don’t return a result. Although, the callings back and fourth is really about packing and unpacking the ocaml
valuetype on the C side. That tutorial you mention covers all of that, and very well.I’m opposed to Ron Savage remarks that you have to be an expert in the language. I recall starting out where I work, and within a few months, without knowing what a “functor” was, being able to call C, and writing a thousand lines of C for numerical recipes, and abstract data types, and there were some hiccups (not with unpacking types, but with garbage collection of an abstract data-types), but it wasn’t bad at all. Most of the inner loops in the project are written in C –taking advantage of SSE, external libraries (lapack), tighter optimized loops, and some in-lined hand optimized assembly.
I think you might need to be experienced with designing a large project and demarcating functional and imperative sections. I would really assess how much ocaml you are going to be writing, and what kind of values you want to pass to C –I’m saying this because I’d be fearful of recommending to someone to pass a recursive data-structure from ocaml to C, actually, it would be lots of unpacking tuples, their contents, and thus a lot of possibility for confusion and bugs.