I have a simple test program (error checks removed):
#include <iostream>
#include <iomanip>
#include <sstream>
#include <string>
int main() {
    std::string line;
    while (std::cin >> line) {
        int value;
        std::stringstream stream(line);
        stream >> std::setbase(0) >> value;
        std::cout << "You typed: " << value << std::endl;
    }
}
This works great for prefix-dependent integer parsing: it parses strings starting with "0x" or "0X" as hexadecimal and strings starting with "0" as octal. Several resources I use explain this behavior. What I haven’t been able to find, though, is an indication in the C++ standard that this is guaranteed to work.
I imagine the extraction operators use strtol under the hood. Section 7.20.1.4.3 of the C standard, on strtol, says (6.4.4.1 is the syntax for integer constants):
If the value of base is zero, the expected form of the subject sequence is that of an
integer constant as described in 6.4.4.1, optionally preceded by a plus or minus sign, but
not including an integer suffix.
This works on the couple of versions of GCC that I’ve tried, but is it safe to use generally?
setbase is defined in C++98 [lib.std.manip]/5, paraphrasing slightly:
Returns: An object s of unspecified type such that [inserting or extracting s from a stream behaves as if a function were called on that stream which calls str.setf(flag, ios_base::basefield), where flag is ios_base::oct, ios_base::dec, or ios_base::hex for a base of 8, 10, or 16, and ios_base::fmtflags(0) for any other value.]
Okay, so, if base is not 8, 10, or 16, then the basefield flags are cleared. The effect of a cleared basefield for input is defined in [lib.facet.num.get.virtuals], table 55 (“Integer conversions”), as equivalent to sscanf("%i") on the sequence of characters next available.
C++98 refers to C89 for the definition of *scanf, naturally enough. I don’t have a PDF copy of C89, but I do have C99, in which section 7.19.6.2 paragraph 12 [the C standard does not have the nice symbolic section names that the C++ standard has] defines "%i" to behave the same as strtol with base argument 0.
So the good news is, prefix-dependent integer scanning is guaranteed by the standard after setbase(0). The bad news is, iostream formatted input is defined in terms of *scanf, which means the dreadful sentence at the end of C99 7.19.6.2p10 applies:
"If this object does not have an appropriate type, or if the result of the conversion cannot be represented in the object, the behavior is undefined."
Clearer version of that sentence: input overflow triggers undefined behavior. The C(++) runtime is allowed to crash the program if input to *scanf has too many digits! This is one of several reasons why I and others keep saying *scanf should never be used, and now I have to start saying it about istream >> int as well. 🙁
The advice that holds for C is even easier to apply in C++: read entire lines with std::getline and parse them by hand. Use the strtol family of functions to convert numeric input to machine numbers. (Those functions have predictable behavior on overflow.)