I know C is purposefully bare-bones, but I’m curious as to why something as commonplace as a substring function is not included in <string.h>.
Is it that there is not one “right enough” way to do it? Too many domain-specific requirements? Can anyone shed any light?
BTW, this is the substring function I came up with after a bit of research.
Edit: I made a few updates based on comments.
#include <string.h>

void substr (char *outStr, const char *inpStr, int startPos, size_t strLen) {
    /* Cannot do anything with NULL. */
    if (inpStr == NULL || outStr == NULL) return;

    size_t len = strlen (inpStr);

    /* Allow negative positions to count from the end, but do not
       start before the start of the string; force to start. */
    if (startPos < 0) {
        startPos = (int)len + startPos;
    }
    if (startPos < 0) {
        startPos = 0;
    }

    /* Cannot start after the end of the string; force to end. */
    if ((size_t)startPos > len) {
        startPos = (int)len;
    }

    /* Number of characters available from startPos onward. */
    len -= (size_t)startPos;

    /* Adjust length if source string too short. */
    if (strLen > len) {
        strLen = len;
    }

    /* Copy string section; outStr must have room for strLen + 1 bytes. */
    memcpy(outStr, inpStr+startPos, strLen);
    outStr[strLen] = '\0';
}
Edit: Based on a comment from r I also came up with this one-liner. You’re on your own for checks though!
#define substr(dest, src, startPos, strLen) snprintf((dest), BUFF_SIZE, "%.*s", (int)(strLen), (src) + (startPos))
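For anyone trying the one-liner, here is a rough sketch of how it might be set up. BUFF_SIZE is assumed to be the size of every destination buffer (the value 16 is just picked for illustration), and the cast to int is there because the precision argument for %.*s has to be an int.

```c
#include <stdio.h>

/* Assumed: every destination buffer is at least this many bytes. */
#define BUFF_SIZE 16

/* Copies at most strLen characters starting at src + startPos into dest.
   No range checks: startPos must not exceed the length of src. */
#define substr(dest, src, startPos, strLen) \
    snprintf((dest), BUFF_SIZE, "%.*s", (int)(strLen), (src) + (startPos))
```

With a char out[BUFF_SIZE], calling substr(out, "Hello, world", 7, 5) leaves "world" in out. Since snprintf is bounded by BUFF_SIZE, the result is always terminated, but it is silently truncated if strLen is BUFF_SIZE or more.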
Basic standard library functions don’t burden themselves with excessive, expensive safety checks, leaving them to the user. Most of the safety checks you carry out in your implementation are of the expensive kind: totally unacceptable in such a basic library function. This is C, not Java.
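To make "leaving them to the user" a bit more concrete: a caller that already knows the source length can validate the range in constant time, instead of the library paying for a strlen on every call. A rough sketch only; range_ok is a made-up helper, not anything from the standard library.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical caller-side check: true if count characters starting at
   start lie inside a string of length srcLen, and the result plus its
   terminating '\0' fits in a buffer of outSize bytes. */
bool range_ok(size_t srcLen, size_t start, size_t count, size_t outSize) {
    return start <= srcLen
        && count <= srcLen - start   /* written this way to avoid overflow */
        && count < outSize;          /* leave room for the terminator */
}
```

A caller would run this once and then call an unchecked copy routine, so code that knows its inputs are fine doesn't pay for the check at all.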
Once you get some checks out of the picture, the “substring” function boils down to an ordinary strlcpy. I.e., ignoring the safety check on startPos, all you need to do is copy from inpStr + startPos with strlcpy.
While strlcpy is not part of the standard library, it can be crudely replaced by a [misused] strncpy. Again, ignoring the safety check on startPos, all you need to do is call strncpy on inpStr + startPos and terminate the result yourself.
Ironically, in your code strncpy is misused in the very same way. On top of that, many of your safety checks are a direct consequence of your choosing a signed type (int) to represent indices, while the proper type would be an unsigned one (size_t).
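The strncpy variant described above might look roughly like this. It is a sketch under the stated assumptions, not a drop-in replacement: the name substr_unchecked is made up, the caller must guarantee that startPos does not exceed strlen(inpStr), and outStr must have room for strLen + 1 bytes.

```c
#include <string.h>

/* Hypothetical name. No safety checks at all: the caller guarantees
   startPos <= strlen(inpStr) and that outStr holds strLen + 1 bytes. */
char *substr_unchecked(char *outStr, const char *inpStr,
                       size_t startPos, size_t strLen) {
    /* strncpy stops at the source's terminator and zero-pads the rest,
       but it does not terminate when the source is longer than strLen. */
    strncpy(outStr, inpStr + startPos, strLen);
    outStr[strLen] = '\0';   /* so terminate explicitly */
    return outStr;
}
```

With size_t for both indices there are no negative positions to handle, which is where most of the original checks came from. For example, substr_unchecked(out, "substring", 3, 6) with an out of at least 7 bytes yields "string".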