I’m starting a new project in plain C (C99) that is going to work primarily with text. Because of external project constraints, this code has to be extremely simple and compact, consisting of a single source-code file without external dependencies or libraries except for libc and similar ubiquitous system libraries.
With that understanding, what are some best practices, gotchas, tricks, or other techniques that can help make the string handling of the project more robust and secure?
Without any additional information about what your code is doing, I would recommend designing all your interfaces so that each function (call it `foobar`) takes a destination buffer `dest` and its size `buf_size`, with semantics like `snprintf`:

- `dest` points to a buffer of size at least `buf_size`.
- If `buf_size` is zero, null/invalid pointers are acceptable for `dest` and nothing will be written.
- If `buf_size` is non-zero, `dest` is always null-terminated.
- `foobar` returns the length of the full non-truncated output; the output has been truncated if `buf_size` is less than or equal to the return value.

This way, when the caller can easily know the destination buffer size that’s required, a sufficiently large buffer can be obtained in advance. If the caller cannot easily know, it can call the function once with either a zero argument for `buf_size`, or with a buffer that’s “probably big enough”, and only retry if it ran out of space.

You can also make a wrapped version of such calls analogous to the GNU `asprintf` function, but if you want your code to be as flexible as possible I would avoid doing any allocation in the actual string functions. Handling the possibility of failure is always easier at the caller level, and many callers can ensure that failure is never a possibility by using a local buffer or a buffer that was obtained much earlier in the program so that the success or failure of a larger operation is atomic (which greatly simplifies error handling).
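
To make the contract above concrete, here is a minimal sketch of one function written in that style, plus an optional allocating wrapper in the spirit of `asprintf`. The names `concat2` and `concat2_alloc` are hypothetical, invented only for this illustration; the point is the shape of the interface, not this particular operation. It assumes the combined length of the inputs fits in a `size_t`.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: write a followed by b into dest.
 * - dest must point to at least buf_size bytes (ignored if buf_size == 0).
 * - If buf_size > 0, dest is always null-terminated, truncating if needed.
 * - Returns the length of the full, untruncated result (excluding '\0'),
 *   so the output was truncated exactly when buf_size <= return value. */
size_t concat2(char *dest, size_t buf_size, const char *a, const char *b)
{
    size_t la = strlen(a);
    size_t lb = strlen(b);

    if (buf_size > 0) {
        size_t room = buf_size - 1;                   /* keep one byte for '\0' */
        size_t na = la < room ? la : room;            /* bytes of a that fit */
        size_t nb = lb < room - na ? lb : room - na;  /* bytes of b that fit */
        memcpy(dest, a, na);
        memcpy(dest + na, b, nb);
        dest[na + nb] = '\0';
    }
    return la + lb;
}

/* Optional wrapper: measure with a zero-size call, allocate, then fill.
 * Returns NULL if allocation fails; the caller frees the result. */
char *concat2_alloc(const char *a, const char *b)
{
    size_t need = concat2(NULL, 0, a, b) + 1;
    char *p = malloc(need);
    if (p != NULL)
        concat2(p, need, a, b);
    return p;
}
```

A caller with a fixed buffer would write something like `n = concat2(buf, sizeof buf, x, y)` and treat `n >= sizeof buf` as truncation. Only callers that really want heap allocation need to go through `concat2_alloc`, which keeps the allocation failure path out of the core function.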