This code loops forever:
#include <iostream>
#include <fstream>
#include <sstream>

int main(int argc, char *argv[])
{
    std::ifstream f(argv[1]);
    std::ostringstream ostr;
    while(f && !f.eof())
    {
        char b[5000];
        std::size_t read = f.readsome(b, sizeof b);
        std::cerr << "Read: " << read << " bytes" << std::endl;
        ostr.write(b, read);
    }
}
That is because readsome never sets eofbit.
cplusplus.com says:
Errors are signaled by modifying the internal state flags:
eofbit: The get pointer is at the end of the stream buffer’s internal input
array when the function is called, meaning that there are no positions to be
read in the internal buffer (which may or not be the end of the input
sequence). This happens when rdbuf()->in_avail() would return -1 before the
first character is extracted.
failbit: The stream was at the end of the source of characters before the
function was called.
badbit: An error other than the above happened.
The standard says almost the same:
[C++11: 27.7.2.3]:
streamsize readsome(char_type* s, streamsize n);
32. Effects: Behaves as an unformatted input function (as described in
27.7.2.3, paragraph 1). After constructing a sentry object, if !good() calls
setstate(failbit) which may throw an exception, and return. Otherwise extracts
characters and stores them into successive locations of an array whose first
element is designated by s.
- If rdbuf()->in_avail() == -1, calls setstate(eofbit) (which may throw
ios_base::failure (27.5.5.4)), and extracts no characters;
- If rdbuf()->in_avail() == 0, extracts no characters
- If rdbuf()->in_avail() > 0, extracts min(rdbuf()->in_avail(),n)).
33. Returns: The number of characters extracted.
That the in_avail() == 0 condition is a no-op implies that ifstream::readsome itself is a no-op if the stream buffer is empty, but the in_avail() == -1 condition implies that it will set eofbit when some other operation has led to in_avail() == -1.
This seems inconsistent, even allowing for the “some” nature of readsome.
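To make the three cases from the quoted Effects paragraph concrete, here is a rough sketch of that logic as a free function. The name readsome_sketch is hypothetical, and this is not how any particular library implements it: it skips the sentry object and cannot update gcount(), which only the real member function can do.

```cpp
#include <algorithm>
#include <cassert>
#include <ios>
#include <istream>
#include <sstream>

// Hypothetical free-function rendering of the quoted wording.
// Simplifications: no sentry object, and gcount() is left untouched.
std::streamsize readsome_sketch(std::istream& in, char* s, std::streamsize n)
{
    assert(s != nullptr && n >= 0);

    if (!in.good())
    {
        in.setstate(std::ios_base::failbit);   // may throw, per the standard
        return 0;
    }

    const std::streamsize avail = in.rdbuf()->in_avail();

    if (avail == -1)
    {
        in.setstate(std::ios_base::eofbit);    // the only path that sets eofbit
        return 0;
    }
    if (avail == 0)
    {
        return 0;                              // no-op: no flag changes at all
    }

    // avail > 0: take whatever is already buffered, up to n characters.
    return in.rdbuf()->sgetn(s, std::min(avail, n));
}
```

The point to notice is the middle branch: when in_avail() is 0 the function returns without touching the stream state, so a loop that waits for eof() has nothing to wait for.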
So what are the semantics of readsome and eof? Have I interpreted them correctly? Are they an example of poor design in the streams library?
(Stolen from the [IMO] invalid libstdc++ bug 52169.)
I think this is a customization point, not really used by the default stream implementations.
in_avail() returns the number of chars it can see in the internal buffer, if any. Otherwise it calls showmanyc() to try to detect if chars are known to be available elsewhere, so a buffer fill request is guaranteed to succeed.
In turn, showmanyc() will return the number of chars it knows about, if any, or -1 if it knows that a read will fail, or 0 if it doesn’t have a clue.
The default implementation (basic_streambuf) always returns 0, so that is what you get unless you have a stream with some other streambuf overriding showmanyc.
Your loop is essentially read-as-many-chars-as-you-know-is-safe, and it gets stuck when that is zero (meaning “not sure”).
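To illustrate the customization point, here is a minimal sketch with a made-up stream buffer whose showmanyc() can be switched between the default answer (0) and -1. FlaggedBuf and readsome_reaches_eof are hypothetical names invented for this example, and the retry cap exists only so the demonstration cannot spin forever.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <streambuf>
#include <string>
#include <utility>

// Hypothetical read-only buffer over a string. With report_end set,
// showmanyc() answers -1 ("a read will fail"); otherwise it keeps the
// basic_streambuf default of 0 ("no idea").
class FlaggedBuf : public std::streambuf
{
public:
    FlaggedBuf(std::string data, bool report_end)
        : data_(std::move(data)), report_end_(report_end)
    {
        char* p = &data_[0];
        setg(p, p, p + data_.size());   // the whole string is the get area
    }

protected:
    // in_avail() only asks this once the get area is exhausted.
    std::streamsize showmanyc() override
    {
        return report_end_ ? -1 : std::streambuf::showmanyc();
    }

private:
    std::string data_;
    bool report_end_;
};

// Drains the stream with readsome(). Returns true if eofbit was set,
// false if we gave up after max_tries calls.
bool readsome_reaches_eof(std::istream& in, std::string& out, int max_tries)
{
    char chunk[64];
    for (int i = 0; i < max_tries && in; ++i)
    {
        const std::streamsize got =
            in.readsome(chunk, static_cast<std::streamsize>(sizeof chunk));
        assert(got >= 0);
        out.append(chunk, static_cast<std::size_t>(got));
        if (in.eof())
        {
            return true;
        }
    }
    return false;
}
```

Wrapping a FlaggedBuf in a std::istream and calling readsome_reaches_eof should give up without ever seeing eofbit when report_end is false, and should see eofbit on the call after the data runs out when it is true. If all you want is to read a whole file, a loop built on read() and gcount() avoids the question entirely, since read() sets eofbit itself when it hits the end of the input.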