I’m a programming student in my first C++ class, and recently we were encouraged to write a simple recursive function to find the first occurrence of a substring in a given string. If found, it returns the index. If the substring is not found, the index_of() function should return -1. We are encouraged to use a helper function that takes the index as one of its parameters, and this is what I’ve tried.
For example:
int index_of("Mississippi", "sip"); // this would return a 6
This is supposed to be a simple exercise to help us understand recursion and won’t be turned in. My professor stated that our actual assignment with recursion will be much more involved, which is why I really want to understand this simple use of recursion.
I’ve done this successfully using C-style strings and pointers, but not with C++ std::string objects. What am I doing wrong in my program? My professor stated we should easily be able to write this in 5 mins, but I’ve been struggling with it for two hours. Here’s what I’ve done so far:
int index_of(string s, string t)
{
    int index = 0;
    if (s[index] == NULL)
        return -1;
    else if (starts_with(s, t, ++index))
    {
        return index;
    }
    else
        return index;
}
bool starts_with(string s, string t, int index)
{
    if (t[index] == NULL)
        return true;
    if ( s[index] == NULL || t[0] != s[index])
        return false;
    return starts_with(s, t, ++index);
}
As written, this function always returns an index of 1.
Start with the test if (s[index] == NULL). Full stop. This isn’t how C++’s strings work, and you must fix this if you want to use them. Even with C-style strings, don’t use NULL to mean the ASCII null character. They share a name but have different purposes, and you should not use NULL to mean integer zero (chars are integer types and the null character is their zero value). Use '\0' or just if (!s[index]).
However, you aren’t allowed to index a std::string unless you know the index is valid. To do that, compare the index against s.size() (and make sure it’s greater than or equal to 0). Even so, what you are really testing here is whether s is empty, and it has a special method to do that: s.empty().
Continuing with the next test, else if (starts_with(s, t, ++index)):
Increment and decrement inside expressions, especially as here, can confuse a beginner and offer no advantage. Here, ++index changes index to 1 before starts_with is even called, and both remaining branches return index, which is why a non-empty s always gives you 1. The main advantage of these operators is code that is succinct and clear, but you have to already understand the main part of the code first, and even then experienced programmers sometimes benefit from being a tiny bit more verbose.
Anecdotally, Go’s creators, who were also involved in early C history, even turned increment from an expression into a statement, and I believe clarity is a large part of the reason.
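To make the two points above concrete (checking an index against size() rather than comparing a character to NULL, and passing index + 1 rather than ++index), here is a small sketch. The functions in_range and count_from are hypothetical and unrelated to the exercise itself; they only illustrate the pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: is `index` a valid position in `s`?
// std::size_t is unsigned, so there is no "less than 0" case to check.
bool in_range(const std::string& s, std::size_t index) {
    return index < s.size();
}

// Hypothetical example: count how often `c` occurs in `s` from `index` on.
std::size_t count_from(const std::string& s, char c, std::size_t index) {
    if (!in_range(s, index))
        return 0;                               // past the end: nothing left
    std::size_t here = (s[index] == c) ? 1 : 0;
    return here + count_from(s, c, index + 1);  // index itself is unchanged
}
```

Because the recursive call receives index + 1 as a plain value, index keeps its meaning for the rest of the function, which is exactly what the ++index version in the question loses.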
From the beginning
You want to implement a function with this signature:
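A sketch of such a declaration, assuming the parameter names haystack and needle (used later in this answer) and const references rather than the by-value strings from the question. Treating an empty needle as found at index 0 is also an assumption of this sketch:

```cpp
#include <cassert>
#include <string>

// Returns the index of the first occurrence of needle in haystack,
// or -1 if needle does not occur in haystack.
// Assumption of this sketch: an empty needle is found at index 0.
int index_of(const std::string& haystack, const std::string& needle);
```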
I include those comments with the signature on purpose: they are part of the public interface for this function. Better parameter names also increase clarity.
Identify the cases you need to consider:
And when there is a match of the first character, you have two sub-cases:
I’ve written these as a recursive algorithm which receives "new copies" of each string (and substring) instead of using indices. However, you can transform it to use indices by changing "first character" to "current character", and similarly for the "empty" conditions. You will want to use two indices in that case (and trying to use only one may have been a major stumbling block for you so far), unless you have a helper function to compare substrings (though I’m unsure if your professor had a separate intention with this comment).
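For illustration, a two-index variant might look like the sketch below. The name index_of_impl and its start and matched parameters are hypothetical: start is the position in haystack where the current attempt began, and matched is how many characters of needle have matched so far.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: try to match needle at haystack[start...],
// with `matched` characters already known to agree.
int index_of_impl(const std::string& haystack, const std::string& needle,
                  std::size_t start, std::size_t matched) {
    if (matched == needle.size())
        return static_cast<int>(start);   // whole needle matched at start
    if (start + matched >= haystack.size())
        return -1;                        // ran out of haystack
    if (haystack[start + matched] == needle[matched])
        return index_of_impl(haystack, needle, start, matched + 1);
    return index_of_impl(haystack, needle, start + 1, 0);  // retry one later
}

int index_of(const std::string& haystack, const std::string& needle) {
    return index_of_impl(haystack, needle, 0, 0);
}
```

The "empty" conditions from the substring formulation become comparisons of an index against size(), and no strings are copied.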
A direct translation of the above prose into code:
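A sketch of what that translation might look like, assuming const std::string& parameters and the convention that an empty needle is found at index 0. The A and B marks in the comments are placed to line up with the discussion that follows:

```cpp
#include <cassert>
#include <string>

int index_of(const std::string& haystack, const std::string& needle) {
    if (needle.empty())
        return 0;     // nothing left to match: found right here
    if (haystack.empty())
        return -1;    // needle remains but haystack is exhausted

    if (haystack[0] == needle[0]) {
        // B: the rest of the needle must start exactly at the front
        // of the rest of the haystack.
        if (index_of(haystack.substr(1), needle.substr(1)) == 0)
            return 0;
        // Otherwise the needle is "broken" here; keep looking below.
    }

    // A: the whole needle may match anywhere in the rest of the haystack.
    int rest = index_of(haystack.substr(1), needle);
    return rest == -1 ? -1 : rest + 1;
}
```

With this sketch, index_of("Mississippi", "sip") yields 6, matching the example in the question.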
The broken needle comment hints at how inefficient that code is, as it bifurcates the recursive calls into two categories: must match at 1 (which is 0 after slicing into substrings), at mark B, and can match anywhere, at mark A. We can improve this with a helper function, and I’ll use std::string’s operator== overload (operating on a substring of haystack) for that. This yields the recursive equivalent of the classical "naive strstr":
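One way this could look, as a sketch: substr(0, needle.size()) takes the front of haystack and operator== compares it with needle in a single step, so only the "can match anywhere" recursion remains. The early size check is an assumption added here so that substr(1) is never called on an empty string.

```cpp
#include <cassert>
#include <string>

int index_of(const std::string& haystack, const std::string& needle) {
    if (needle.size() > haystack.size())
        return -1;    // needle cannot fit in what is left
    if (haystack.substr(0, needle.size()) == needle)
        return 0;     // match at the front
    // haystack is non-empty here, so substr(1) is safe.
    int rest = index_of(haystack.substr(1), needle);
    return rest == -1 ? -1 : rest + 1;
}
```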
And when using an index for haystack with string::compare as the helper so a needle index isn’t required:
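A sketch of the index-based form, assuming a hypothetical helper named index_of_from that carries the current haystack position. It uses the std::string::compare(pos, len, str) overload, which compares the len characters starting at pos against str:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: search for needle in haystack starting at pos.
int index_of_from(const std::string& haystack, const std::string& needle,
                  std::size_t pos) {
    // Stop once needle no longer fits; this also keeps pos <= size(),
    // which compare() requires.
    if (pos > haystack.size() || needle.size() > haystack.size() - pos)
        return -1;
    if (haystack.compare(pos, needle.size(), needle) == 0)
        return static_cast<int>(pos);
    return index_of_from(haystack, needle, pos + 1);  // tail call
}

int index_of(const std::string& haystack, const std::string& needle) {
    return index_of_from(haystack, needle, 0);
}
```

The recursive call is the last thing the helper does and its result is returned unchanged, so no "+ 1" fix-up is needed on the way back.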
Notice this version is tail-recursive, but this is still a naive algorithm and more advanced ones exist.
You’ve said this has helped you a lot, but, even with the additional examples I just included, it seems lacking to me. Substring search is not a good recursion exercise, in my opinion, and that could be why.