Why the c++’s implemented string::find() doesn’t use the KMP algorithm (and doesn’t run in O(N + M)) and runs in O(N * M)? Is that corrected in C++0x?
If the complexity of current find is not O(N * M), what is that?
so what algorithm is implemented in gcc? is that KMP? if not, why?
I’ve tested that and the running time shows that it runs in O(N * M)
I assume you mean
find(), rather thansubstr()which doesn’t need to search and should run in linear time (and only because it has to copy the result into a new string).The C++ standard doesn’t specify implementation details, and only specifies complexity requirements in some cases. The only complexity requirements on
std::stringoperations are thatsize(),max_size(),operator[],swap(),c_str()anddata()are all constant time. The complexity of anything else depends on the choices made by whoever implemented the library you’re using.The most likely reason for choosing a simple search over something like KMP is to avoid needing extra storage. Unless the string to be found is very long, and the string to search contains a lot of partial matches, the time taken to allocate and free that would likely be much more than the cost of the extra complexity.
No, C++11 doesn’t add any complexity requirements to
std::string, and certainly doesn’t add any mandatory implementation details.That’s the worst-case complexity, when the string to search contains a lot of long partial matches. If the characters have a reasonably uniform distribution, then the average complexity would be closer to
O(N). So by choosing an algorithm with better worst-case complexity, you may well make more typical cases much slower.