I wrote a function which percent-encodes a string, as follows:
string percent_encode(string str)
{
    string reserved =
        // gen-delims
        ":/?#[]@"
        // sub-delims
        "!$&'()*+,;="
    ;
    for(string::iterator i = str.begin(); i < str.end(); i++) {
        int c = *i;
        // replaces reserved, unreserved non-ascii and space characters.
        if(c > 127 || c == 32 || reserved.find(*i) != string::npos) {
            std::stringstream ss;
            ss << std::hex << c;
            str.replace(i, i + 1, "%" + ss.str());
        }
    }
    return str;
}
When I call this function for a string like “a&b”, an out_of_range exception is thrown:
terminate called after throwing an instance of 'std::out_of_range'
what(): basic_string::replace
I traced this exception with a debugger and saw that the replacement itself worked, but the loop somehow iterates beyond end().
This is what I get when I watch the iterator “i”:
{_M_current = 0x7fc43d61bd78 "a&b"}
{_M_current = 0x7fc43d61bd79 "&b"}
{_M_current = 0x7fc43d61bd7a "b"}
{_M_current = 0x7fc43d61bd7b ""}
{_M_current = 0x7fc43d61bd7c "o = a&b\n"}
{_M_current = 0x7fc43d61bd7d " = a&b\n"}
Then it tries to replace “=” and fails with an out_of_range exception.
I do not understand how the iterator can end up beyond end().
I would appreciate it if someone could explain how this is possible, because I could not find anyone on the web who had the same problem.
Thanks and regards,
reeaal
Edit:
Argh, I really thought too complicated. x)
This is how I solved it now.
string percent_encode(string str)
{
    string reserved =
        // gen-delims
        ":/?#[]@"
        // sub-delims
        "!$&'()*+,;="
    ;
    std::stringstream ss;
    for(string::iterator i = str.begin(); i < str.end(); i++) {
        // encodes reserved, unreserved non-ascii and space characters.
        // convert via unsigned char so that non-ascii bytes are not negative
        // on platforms where plain char is signed.
        int c = static_cast<unsigned char>(*i);
        if(c > 126 || c == 32 || reserved.find(*i) != string::npos) {
            ss << '%' << std::hex << c;
        } else {
            ss << *i;
        }
    }
    return ss.str();
}
Thanks Diego 🙂
replace can reallocate the string's buffer as the string grows, which invalidates the current iterator, so it may go beyond the end. There are several ways to write this code correctly. For example, generating (and returning) a new string would be the easiest, and maybe even more efficient (note that replace has to move the rest of the string as well). Another option is to work with indices, updating the position and the string length after each replacement.
But the option of returning a completely new string is the best I can think of. Much more functional 🙂
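For completeness, here is a rough sketch of the other option mentioned above, working with indices instead of iterators. An index stays valid even when replace reallocates the buffer, and str.size() is re-read on every pass. The function name percent_encode_in_place is made up for this example, and the two-digit zero padding and the cast through unsigned char are my own assumptions, not something from the original code.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical name: an in-place, index-based variant of percent_encode.
std::string percent_encode_in_place(std::string str)
{
    // gen-delims followed by sub-delims, as in the question.
    const std::string reserved = ":/?#[]@" "!$&'()*+,;=";

    // An index survives a reallocation; an iterator does not.
    // str.size() is evaluated again on every pass, so growth is picked up.
    for (std::string::size_type pos = 0; pos < str.size(); ++pos) {
        // Assumption: go through unsigned char so bytes >= 0x80 are not negative.
        const int c = static_cast<unsigned char>(str[pos]);
        if (c > 126 || c == 32 || reserved.find(str[pos]) != std::string::npos) {
            std::ostringstream ss;
            // Assumption: always emit two hex digits.
            ss << '%' << std::hex << std::setw(2) << std::setfill('0') << c;
            const std::string escape = ss.str();
            str.replace(pos, 1, escape);
            // Skip what was just inserted; the loop's ++pos steps over its last character.
            pos += escape.size() - 1;
        }
    }
    return str;
}
```

Called with "a&b", this returns "a%26b". Building a new string, as in the edited question, is still the simpler route; this only shows that the in-place version works once the position is tracked as an index.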