This is a follow-up to this question. Suppose I write a C++ interface that accepts or returns a const string. I can use a const char* zero-terminated string:
void f(const char* str); // (1)
The other way would be to use an std::string:
void f(const string& str); // (2)
It’s also possible to write an overload and accept both:
void f(const char* str); // (3)
void f(const string& str);
Or even a template in conjunction with boost string algorithms:
template<class Range> void f(const Range& str); // (4)
My thoughts are:
- (1) is not C++ish and may be less efficient when subsequent operations may need to know the string length.
- (2) is bad because now f("long very long C string"); invokes a construction of std::string which involves a heap allocation. If f uses that string just to pass it to some low-level interface that expects a C-string (like fopen) then it is just a waste of resources.
- (3) causes code duplication, although one f can call the other depending on what is the most efficient implementation. However, we can’t overload based on return type, like in the case of std::exception::what(), which returns a const char*.
- (4) doesn’t work with separate compilation and may cause even larger code bloat.
- Choosing between (1) and (2) based on what’s needed by the implementation is, well, leaking an implementation detail to the interface.
The question is: what is the preferred way? Is there any single guideline I can follow? What’s your experience?
Edit: There is also a fifth option:
void f(boost::iterator_range<const char*> str); // (5)
which has the pros of (1) (doesn’t need to construct a string object) and (2) (the size of the string is explicitly passed to the function).
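To illustrate the idea behind (5) without depending on Boost, here is a minimal sketch of a non-owning character range. The names char_range and count_spaces are hypothetical and only stand in for boost::iterator_range<const char*> and for f; this is not Boost's actual interface.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for boost::iterator_range<const char*>:
// a non-owning [first, last) view over characters.
struct char_range {
    const char* first;
    const char* last;

    // From a zero-terminated string: the length is computed once, here.
    char_range(const char* s) : first(s), last(s + std::strlen(s)) {}

    // From a std::string: no copy, the length is already known.
    char_range(const std::string& s)
        : first(s.data()), last(s.data() + s.size()) {}

    std::size_t size() const {
        return static_cast<std::size_t>(last - first);
    }
};

// Hypothetical function that needs only the characters and their count.
std::size_t count_spaces(char_range str) {
    std::size_t n = 0;
    for (const char* p = str.first; p != str.last; ++p) {
        if (*p == ' ') ++n;
    }
    return n;
}
```

Both count_spaces("a b c") and count_spaces(some_std_string) compile without constructing a std::string. One caveat of this design: a range is not guaranteed to be zero-terminated, so it cannot be handed directly to an interface such as fopen.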
For taking a parameter I would go with whatever is simplest and often that is const char*. This works with string literals with zero cost and retrieving a const char* from something stored in a std::string is typically very low cost as well.
Personally, I wouldn’t bother with the overload. In all but the simplest cases you will want to merge the two code paths and have one call the other at some point or both call a common function. It could be argued that having the overload hides whether one is converted to the other or not and which path has a higher cost.
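As a minimal sketch of the "one calls the other" arrangement, assuming a hypothetical function text_length that is not part of the question, the std::string overload below simply forwards to the const char* version:

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical primary implementation: works on a zero-terminated string.
std::size_t text_length(const char* str) {
    return std::strlen(str);
}

// Thin overload that forwards to the const char* version.
// c_str() does not copy, but the length is recomputed by strlen.
inline std::size_t text_length(const std::string& str) {
    return text_length(str.c_str());
}
```

A caller cannot tell from the signatures that the std::string path throws away the known size and stops at the first embedded '\0'. That is the kind of hidden cost and hidden conversion mentioned above.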
Only if I actually wanted to use const features of the std::string interface inside the function would I have const std::string& in the interface itself and I’m not sure that just using size() would be enough of a justification.
In many projects, for better or worse, alternative string classes are often used. Many of these, like std::string, give cheap access to a zero-terminated const char*; converting to a std::string requires a copy. Requiring a const std::string& in the interface is dictating a storage strategy even when the internals of the function don’t need to specify this. I consider this to be undesirable, much like taking a const shared_ptr<X>& dictates a storage strategy whereas taking X&, if possible, allows the caller to use any storage strategy for a passed object.
The disadvantages of a const char* are that, purely from an interface standpoint, it doesn’t enforce non-nullness (although very occasionally some interfaces make use of the difference between a null parameter and an empty string, which can’t be done with std::string), and a const char* might be the address of just a single character. In practice, though, the use of a const char* to pass a string is so prevalent that I would consider citing this as a negative to be a fairly trivial concern. Other concerns, such as whether the encoding of the characters is specified in the interface documentation (this applies to both std::string and const char*), are much more important and likely to cause more work.
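The null-versus-empty distinction can be sketched as follows. The enum arg_state and the function classify are hypothetical names used only for illustration, and the code assumes C++11 for nullptr and enum class.

```cpp
// Hypothetical classification of a const char* argument.
enum class arg_state { absent, empty, present };

// A null pointer means "no value was supplied"; "" means "an empty value".
// A const std::string& parameter could not express the first case.
arg_state classify(const char* str) {
    if (str == nullptr) return arg_state::absent;
    if (*str == '\0') return arg_state::empty;
    return arg_state::present;
}
```

An interface that does not want this distinction still has to document or check for null, since the type itself does not rule it out.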