I’m timing the difference between various ways to print text to standard output. I’m testing cout, printf, and ostringstream using both \n and std::endl. I expected std::endl to make a difference with cout (and it did), but I didn’t expect it to slow down output with ostringstream. I thought using std::endl would just write a \n to the stream and it would still only get flushed once. What’s going on here? Here’s all my code:
// cout.cpp
#include <iostream>
using namespace std;
int main() {
    for (int i = 0; i < 10000000; i++) {
        cout << "Hello World!\n";
    }
    return 0;
}
// printf.cpp
#include <stdio.h>
int main() {
    for (int i = 0; i < 10000000; i++) {
        printf("Hello World!\n");
    }
    return 0;
}
// stream.cpp
#include <iostream>
#include <sstream>
using namespace std;
int main () {
    ostringstream ss;
    for (int i = 0; i < 10000000; i++) {
        ss << "stream" << endl;
    }
    cout << ss.str();
}
// streamn.cpp
#include <iostream>
#include <sstream>
using namespace std;
int main () {
    ostringstream ss;
    for (int i = 0; i < 10000000; i++) {
        ss << "stream\n";
    }
    cout << ss.str();
}
And here’s my Makefile:
SHELL:=/bin/bash
all: cout.cpp printf.cpp
	g++ cout.cpp -o cout.out
	g++ printf.cpp -o printf.out
	g++ stream.cpp -o stream.out
	g++ streamn.cpp -o streamn.out
time:
	time ./cout.out > output.txt
	time ./printf.out > output.txt
	time ./stream.out > output.txt
	time ./streamn.out > output.txt
Here’s what I get when I run make followed by make time:
time ./cout.out > output.txt
real 0m1.771s
user 0m0.616s
sys 0m0.148s
time ./printf.out > output.txt
real 0m2.411s
user 0m0.392s
sys 0m0.172s
time ./stream.out > output.txt
real 0m2.048s
user 0m0.632s
sys 0m0.220s
time ./streamn.out > output.txt
real 0m1.742s
user 0m0.404s
sys 0m0.200s
These results are consistent.
std::endl triggers a flush of the stream, which slows down printing a lot. See http://en.cppreference.com/w/cpp/io/manip/endl
It is often recommended not to use std::endl unless you really want the stream to be flushed. Whether this really matters to you depends on your use case.
Regarding why flush has a performance impact even on an ostringstream (where no flushing should happen): it seems that an implementation is required to at least construct the sentry objects. Those need to check good and tie of the ostream. The call to pubsync should be able to be optimized out. This is based on my reading of libc++ and libstdc++.
After some more reading, the interesting question seems to be this: is an implementation of basic_ostringstream::flush really required to construct the sentry object? If not, this seems like a “quality of implementation” issue to me. But I actually think it needs to, because even a basic_stringbuf can change to have its badbit set.
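To make the flush path visible on a string-backed stream, here is a minimal sketch. It assumes a hypothetical instrumented buffer, CountingBuf, derived from std::stringbuf, which counts how often sync() is reached (ostream::flush() gets there through pubsync()). The names CountingBuf, Result, write_with_endl, write_with_newline and demo are made up for illustration and are not part of the standard library or of the question's code. The sketch only counts calls; it does not measure how expensive each call is on any particular implementation.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical instrumented buffer: a std::stringbuf that counts how often
// sync() is reached. ostream::flush() calls rdbuf()->pubsync(), which
// forwards to the virtual sync().
class CountingBuf : public std::stringbuf {
public:
    int sync_calls = 0;

protected:
    int sync() override {
        ++sync_calls;
        return 0;  // report success, like the default basic_streambuf::sync()
    }
};

// What ended up in the buffer, and how many flushes reached it.
struct Result {
    std::string text;
    int sync_calls;
};

// Writes `line` n times, terminating each line with std::endl.
Result write_with_endl(const std::string& line, int n) {
    CountingBuf buf;
    std::ostream os(&buf);
    for (int i = 0; i < n; ++i) {
        os << line << std::endl;  // newline plus flush
    }
    return {buf.str(), buf.sync_calls};
}

// Writes `line` n times, terminating each line with a plain '\n'.
Result write_with_newline(const std::string& line, int n) {
    CountingBuf buf;
    std::ostream os(&buf);
    for (int i = 0; i < n; ++i) {
        os << line << '\n';  // newline only, no flush
    }
    return {buf.str(), buf.sync_calls};
}

// Usage sketch: call this from main() to check both behaviours.
void demo() {
    Result a = write_with_endl("stream", 5);
    Result b = write_with_newline("stream", 5);
    assert(a.text == b.text);   // the same bytes end up in the buffer
    assert(a.sync_calls == 5);  // one flush per std::endl
    assert(b.sync_calls == 0);  // '\n' alone never flushes
}
```

Both variants leave identical text in the buffer; the only difference is that the std::endl version goes through flush() once per line, which is where the sentry construction and the pubsync() call come in.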