I’m planning to write a library that should be usable by a large number of people on a wide spectrum of platforms. What do I have to consider to design it right? To make this question more specific, there are four “subquestions” at the end.
Choice of language
Considering all the known requirements and details, I concluded that a library written in C or C++ was the way to go. I think the primary usage of my library will be in programs written in C, C++ and Java SE, but I can also think of reasons to use it from Java ME, PHP, .NET, Objective-C, Python, Ruby, Bash scripts, etc. Maybe I cannot target all of them, but if it’s possible, I’ll do it.
Requirements
It would be too much to describe the full purpose of my library here, but there are some aspects that might be important to this question:
- The library itself will start out small, but definitely will grow to enormous complexity, so it is not an option to maintain several versions in parallel.
- Most of the complexity will be hidden inside the library, though
- The library will construct an object graph that is used heavily inside. Some clients of the library will only be interested in specific attributes of specific objects, while other clients must traverse the object graph in some way
- Clients may change the objects, and the library must be notified thereof
- The library may change the objects, and the client must be notified thereof, if it already has a handle to that object
- The library must be multi-threaded, because it will maintain network connections to several other hosts
- While some requests to the library may be handled synchronously, many of them will take too long and must be processed in the background, and notify the client on success (or failure)
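To make the last two requirements more concrete, here is a minimal sketch of how background completion could look from the client's side. All names (`lib_request`, `lib_poll`, `lib_done_cb`, `count_failures`) are hypothetical, and the background work is only simulated: a finished result is queued immediately and delivered later, on the caller's thread, when the client polls.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical API: none of these names come from an existing library. */
typedef void (*lib_done_cb)(int request_id, int status, void *user_data);

enum { LIB_MAX_PENDING = 16 };

typedef struct {
    int         id;
    int         status;      /* 0 = success, anything else = failure */
    lib_done_cb cb;
    void       *user_data;
} lib_pending;

/* Queue of finished requests. A real multi-threaded library would
   protect this with a mutex; the sketch is single-threaded. */
static lib_pending g_done[LIB_MAX_PENDING];
static size_t      g_done_count = 0;
static int         g_next_id    = 1;

/* Start a request. Returns its id, or -1 if cb is NULL or the queue is full.
   simulated_status stands in for the outcome of the background work. */
int lib_request(int simulated_status, lib_done_cb cb, void *user_data)
{
    lib_pending *p;
    if (cb == NULL || g_done_count == LIB_MAX_PENDING)
        return -1;
    p = &g_done[g_done_count++];
    p->id        = g_next_id++;
    p->status    = simulated_status;
    p->cb        = cb;
    p->user_data = user_data;
    return p->id;
}

/* Deliver all finished requests on the calling thread.
   Returns the number of callbacks invoked. */
int lib_poll(void)
{
    lib_pending batch[LIB_MAX_PENDING];
    size_t n = g_done_count, i;

    /* Copy first, so a callback may safely call lib_request() again. */
    for (i = 0; i < n; ++i)
        batch[i] = g_done[i];
    g_done_count = 0;

    for (i = 0; i < n; ++i)
        batch[i].cb(batch[i].id, batch[i].status, batch[i].user_data);
    return (int)n;
}

/* Example client callback: counts failures in the int user_data points to. */
void count_failures(int request_id, int status, void *user_data)
{
    (void)request_id;
    assert(user_data != NULL);
    if (status != 0)
        ++*(int *)user_data;
}
```

Because callbacks only fire inside `lib_poll`, the client decides which thread receives notifications, which may matter for client languages that do not cope well with being called from foreign threads.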
Of course, answers are welcome no matter if they address my specific requirements, or if they answer the question in a general way that matters to a wider audience!
My assumptions, so far
So here are some of my assumptions and conclusions, which I gathered in the past months:
- Internally I can use whatever I want, e.g. C++ with operator overloading, multiple inheritance, template meta programming… as long as there is a portable compiler which handles it (think of gcc / g++)
- But my interface has to be a clean C interface that does not involve name mangling
- Also, I think my interface should only consist of functions, with basic/primitive data types (and maybe pointers) passed as parameters and return values
- If I use pointers, I think I should only use them to pass them back to the library, not to operate directly on the referenced memory
- For usage in a C++ application, I might also offer an object-oriented interface (which is also prone to name mangling, so the app must either use the same compiler or include the library in source form)
- Is this also true for usage in C# ?
- For usage in Java SE / Java EE, the Java Native Interface (JNI) applies. I have some basic knowledge about it, but I should definitely double-check it.
- Not all client languages handle multithreading well, so there should be a single thread talking to the client
- For usage on Java ME, there is no such thing as JNI, but I might go with Nested VM
- For usage in Bash scripts, there must be an executable with a command line interface
- For the other client languages, I have no idea
- For most client languages, it would be nice to have kind of an adapter interface written in that language. I think there are tools to automatically generate this for Java and some others
- For object-oriented languages, it might be possible to create an object-oriented adapter which hides the fact that the interface to the library is function based, but I don’t know if it’s worth the effort
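As an illustration of the assumptions about a function-only C interface with opaque pointers, here is a minimal sketch. The `mylib_node` type and all `mylib_*` functions are hypothetical names made up for this example; the point is only that the client sees an incomplete type, passes the pointer back to the library, and never touches the referenced memory directly.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* ---- public header part (hypothetical names) ---- */
#ifdef __cplusplus
extern "C" {            /* keeps the symbols unmangled for C++ callers */
#endif

/* Incomplete type: clients can hold a pointer but cannot look inside. */
typedef struct mylib_node mylib_node;

mylib_node *mylib_node_create(int32_t value);
int32_t     mylib_node_get_value(const mylib_node *node);
int         mylib_node_set_value(mylib_node *node, int32_t value);
void        mylib_node_destroy(mylib_node *node);

#ifdef __cplusplus
}
#endif

/* ---- implementation part, hidden inside the library ---- */
struct mylib_node {
    int32_t  value;
    uint32_t change_count;   /* internal bookkeeping, invisible to clients */
};

mylib_node *mylib_node_create(int32_t value)
{
    mylib_node *node = malloc(sizeof *node);
    if (node != NULL) {
        node->value        = value;
        node->change_count = 0;
    }
    return node;             /* NULL signals allocation failure */
}

int32_t mylib_node_get_value(const mylib_node *node)
{
    assert(node != NULL);
    return node->value;
}

/* Returns 0 on success, -1 if the handle is NULL. Because every change
   goes through this function, the library always learns about it. */
int mylib_node_set_value(mylib_node *node, int32_t value)
{
    if (node == NULL)
        return -1;
    node->value = value;
    node->change_count++;
    return 0;
}

void mylib_node_destroy(mylib_node *node)
{
    free(node);              /* free(NULL) is a no-op */
}
```

Since the struct layout never crosses the library boundary, it can change between versions without breaking clients, and a binding in another language only needs to store an address.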
Possible subquestions
- is this possible with manageable effort, or is it just too much portability?
- are there any good books / websites about this kind of design criteria?
- are any of my assumptions wrong?
- which open source libraries are worth studying to learn from their design / interface / source?
- meta: This question is rather long, do you see any way to split it into several smaller ones? (If you reply to this, do it as a comment, not as an answer)
Mostly correct. A straight procedural interface is best (which, by the way, is not entirely the same as C(**), but close enough).
I interface with DLLs a lot(*), both open source and commercial, so here are some points that I remember from daily practice. Note that these are recommended areas to research rather than cardinal truths:
(*) Delphi programmer by day, a job that involves interfacing a lot of hardware and thus translating vendor SDK headers. By night Free Pascal developer, in charge of, among others, the Windows headers.
(**)
This is because what “C” means at the binary level still depends on the C compiler used, especially if there is no real universal system ABI. Think of stuff like:
integer registers or not if a parameter is registerable in an FPU register
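A minimal sketch of how a public header might reduce its dependence on the compiler, under the assumption of a C11 compiler (for `_Static_assert`). The `MYLIB_CALL` macro, the `mylib_record` struct and the function are hypothetical names for illustration only: fixed-width types pin the field sizes, the macro pins the calling convention where a platform offers several, and the static assertions fail the build if a compiler lays the struct out differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical macro: name the calling convention explicitly on 32-bit x86
   Windows, where several coexist; expand to nothing everywhere else. */
#if defined(_WIN32) && (defined(_M_IX86) || defined(__i386__))
#  define MYLIB_CALL __stdcall
#else
#  define MYLIB_CALL
#endif

/* Fixed-width fields, ordered so that no padding is needed. */
typedef struct {
    uint32_t id;       /* expected at offset 0 */
    uint32_t flags;    /* expected at offset 4 */
    double   weight;   /* expected at offset 8 */
} mylib_record;

/* Compile-time layout checks: a compiler or packing option that changes
   the layout stops the build instead of silently corrupting data. */
_Static_assert(sizeof(mylib_record) == 16, "unexpected struct size");
_Static_assert(offsetof(mylib_record, flags) == 4, "unexpected offset");
_Static_assert(offsetof(mylib_record, weight) == 8, "unexpected offset");

/* The struct is passed by pointer, which sidesteps compiler differences
   in how small structs are passed by value. */
double MYLIB_CALL mylib_record_weight(const mylib_record *r)
{
    assert(r != NULL);
    return r->weight;
}
```

This does not make the interface compiler-independent; it only turns some silent mismatches into build errors.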
===== Automated header conversions =====
While I don’t know SWIG that well, I know and use some Delphi-specific header tools (h2pas, Darth/headconv, etc.).
However, I never use them in fully automatic mode, since more often than not the output sucks: comments move to a different line or are stripped, and formatting is not retained.
I usually make a small script (in Pascal, but you can use anything with decent string support) that splits a header up, and then try a tool on relatively homogeneous parts (e.g. only structures, or only defines etc).
Then I check if I like the automated conversion output, and either use it, or try to make a specific converter myself. Since it is for a subset (like only structures), it is often way easier than making a complete header converter. Of course, it depends a bit on what my target is (nice, readable headers, or quick and dirty). At each step I might do a few substitutions (with sed or an editor).
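The splitting step can be sketched roughly as follows. This is not the Pascal script described above, just an illustrative and deliberately naive line classifier written in C; the names `splitter`, `part_kind` and `classify_line` are made up for the example. It tags each header line as a define, part of a struct, or anything else, so that each group can be written to its own file and fed to a conversion tool separately.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

typedef enum { PART_DEFINE, PART_STRUCT, PART_OTHER } part_kind;

/* State carried from line to line; initialise both fields to 0. */
typedef struct {
    int brace_depth;   /* number of currently open '{' inside a struct */
    int in_struct;     /* nonzero while a struct declaration is open */
} splitter;

/* True if s starts with kw and kw is not just the prefix of a longer name. */
static int has_keyword(const char *s, const char *kw)
{
    size_t n = strlen(kw);
    unsigned char next;
    if (strncmp(s, kw, n) != 0)
        return 0;
    next = (unsigned char)s[n];
    return !(isalnum(next) || next == '_');
}

/* Classify one line of a header. Naive on purpose: it ignores comments,
   "# define" with a space, and functions that return a struct. */
part_kind classify_line(splitter *st, const char *line)
{
    const char *p = line;
    assert(st != NULL && line != NULL);

    while (*p != '\0' && isspace((unsigned char)*p))
        ++p;                                  /* skip leading whitespace */

    if (!st->in_struct) {
        if (has_keyword(p, "#define"))
            return PART_DEFINE;
        if (has_keyword(p, "typedef struct") || has_keyword(p, "struct"))
            st->in_struct = 1;
        else
            return PART_OTHER;
    }

    /* Inside a struct: it ends at the first ';' outside all braces. */
    for (; *p != '\0'; ++p) {
        if (*p == '{')
            ++st->brace_depth;
        else if (*p == '}')
            --st->brace_depth;
        else if (*p == ';' && st->brace_depth == 0)
            st->in_struct = 0;
    }
    return PART_STRUCT;
}
```

A driver would read the header line by line, call `classify_line` on each line, and append the line to one of three output files depending on the result.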
The most complicated scheme I did was for the Winapi commctrl and ActiveX/comctl headers. There I combined IDL and the C header (IDL for the interfaces, which are a bunch of unparsable macros in C; the C header for the rest), and managed to get about 80% of the macros typed (by propagating the typecasts in the SendMessage macros back to the macro declaration, with reasonable (wparam, lparam, lresult) defaults).
The semi-automated way has the disadvantage that the order of declarations differs from the original (e.g. first constants, then structures, then function declarations), which sometimes makes maintenance a pain. I therefore always keep the original headers/SDK to compare with.
The Jedi winapi conversion project might have more info; they translated about half of the Windows headers to Delphi and thus have enormous experience.